HUMAN GENE
TECHNICAL FIELD 
The present invention relates to a gene useful 
as an indicator in the prophylaxis, diagnosis and
treatment of diseases in humans. More particularly, it 
relates to a novel human gene analogous to rat, mouse, 
yeast, nematode and known human genes, among others, and 
utilizable, after cDNA analysis thereof, chromosome 
mapping of cDNA and function analysis of cDNA, in gene 
diagnosis using said gene and in developing a novel 
therapeutic method. 

BACKGROUND ART 

The genetic information of a living thing has
been accumulated as sequences (DNA) of four bases, namely
A, C, G and T, which exist in cell nuclei. Said genetic
information has been preserved for line preservation and
ontogeny of each individual living thing.

In the case of human beings, the number of said
bases is said to be about 3 billion (3 x 10^9) and
supposedly there are 50 to 100 thousand genes therein.
Such genetic information serves to maintain biological
phenomena in that regulatory proteins, structural
proteins and enzymes are produced via such a route that
mRNA is transcribed from a gene (DNA) and then
translated into a protein. Abnormalities in said route from
gene to protein translation are considered to be
causative of abnormalities of life supporting systems,
for example in cell proliferation and differentiation,
hence causative of various diseases.

As a result of gene analyses so far made, a
number of genes which may be expected to serve as useful
materials in drug development have been found, for
example genes for various receptors such as insulin
receptor and LDL receptor, genes involved in cell
proliferation and differentiation and genes for metabolic
enzymes such as proteases, ATPase and superoxide
dismutases.

However, analysis of human genes and studies of
the functions of the genes analyzed and of the relations
between the genes analyzed and various diseases have
only just begun and many points remain unknown. Further
analysis of novel genes, analysis of the functions 
thereof, studies of the relations between the genes 
analyzed and diseases, and studies for applying the genes 
analyzed to gene diagnosis or for medicinal purposes, for 
instance, are therefore desired in the relevant art. 

If such a novel human gene as mentioned above 
can be provided, it will be possible to analyze the level 
of expression thereof in each cell and the structure and 
function thereof and, through expression product analysis
and other studies, it may become possible to reveal the
pathogenesis of a disease associated therewith, for
example a genopathy or cancer, or diagnose and treat said
disease, for instance. It is an object of the present
invention to provide such a novel human gene. 

For attaining the above object, the present 
inventors made intensive investigations and obtained the
findings mentioned below. Based thereon, the present 
invention has now been completed. 

DISCLOSURE OF INVENTION 
Thus, the present inventors synthesized cDNAs
based on mRNAs extracted from various tissues, inclusive
of human fetal brain, adult blood vessels and placenta,
constructed libraries by inserting them into vectors,
allowed colonies of Escherichia coli transformed with
said libraries to form on agar medium, picked up colonies
at random, transferred them to 96-well micro plates and
registered a large number of human gene-containing E.
coli clones.

Each clone thus registered was cultivated on a
small scale, DNA was extracted and purified, the four
base-specifically terminating extension reactions were
carried out by the dideoxy chain terminator method using
the cDNA extracted as a template, and the base sequence
of the gene was determined over about 400 bases from the
5' terminus thereof using an automatic DNA sequencer.
Based on the thus-obtained base sequence information, a
novel family gene analogous to known genes of animal and
plant species such as bacteria, yeasts, nematodes, mice
and humans was searched for.

The method of the above-mentioned cDNA analysis
is described in detail in the literature by Fujiwara,
one of the present inventors [Fujiwara, Tsutomu, Saibo
Kogaku (Cell Engineering), 14, 645-654 (1995)].

Among this group, there are novel receptors, 
DNA binding domain-containing transcription regulating 
factors, signal transmission system factors, metabolic 
enzymes and so forth. Based on the homology of the novel 
gene of the present invention as obtained by gene 
analysis to the genes analogous thereto, the product of 
the gene, hence the function of the protein, can 
approximately be estimated by analogy. Furthermore, such 
functions as enzyme activity and binding ability can be 
investigated by inserting the candidate gene into an 
expression vector to give a recombinant. 

According to the present invention, there are 
provided a novel human gene characterized by containing a 
nucleotide sequence coding for an amino acid sequence 
defined by SEQ ID NO:1, :4, :7, :10, :13, :16, :19, :22,
:25, :28, :31, :34, :37 or :40, a human gene characterized
by containing the nucleotide sequence defined by SEQ ID
NO:2, :5, :8, :11, :14, :17, :20, :23, :26, :29, :32,
:35, :38 or :41, respectively coding for the amino acid
sequence mentioned above, and a novel human gene
characterized by the nucleotide sequence defined by SEQ
ID NO:3, :6, :9, :12, :15, :18, :21, :24, :27, :30, :33,
:36, :39 or :42.
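
The three lists of SEQ ID NOs above follow a regular pattern: for the
n-th gene, the amino acid sequence appears to be numbered 3n - 2, the
coding nucleotide sequence 3n - 1 and the full cDNA sequence 3n. The
following is a minimal sketch of that bookkeeping, assuming the pattern
holds for all 14 genes; the function name and dictionary keys are
hypothetical and serve illustration only.

```python
def seq_id_triplet(n: int) -> dict:
    """Return the SEQ ID NOs assumed to belong to the n-th gene (1 <= n <= 14)."""
    if not 1 <= n <= 14:
        raise ValueError("n must be an integer from 1 to 14")
    return {
        "amino_acid": 3 * n - 2,     # SEQ ID NO:1, :4, ..., :40
        "coding_region": 3 * n - 1,  # SEQ ID NO:2, :5, ..., :41
        "full_cdna": 3 * n,          # SEQ ID NO:3, :6, ..., :42
    }


# One entry per gene, in the order of the sequence listing.
triplets = [seq_id_triplet(n) for n in range(1, 15)]
```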

The symbols used herein for indicating amino
acids, peptides, nucleotides, nucleotide sequences and so
on are those recommended by IUPAC and IUB or in "Guideline
for drafting specifications etc. including
nucleotide sequences or amino acid sequences" (edited by
the Japanese Patent Office), or those in conventional use
in the relevant field of art.

As specific examples of such gene of the 
present invention, there may be mentioned genes deducible 
from the DNA sequences of the clones designated as
"GEN-501D08", "GEN-080G01", "GEN-025F07", "GEN-076C09",
"GEN-331G07", "GEN-163D09", "GEN-078D05TA13", "GEN-423A12",
"GEN-092E10", "GEN-428B12", "GEN-073E07", "GEN-093E05"
and "GEN-077A09" shown later herein in Examples 1 to 11.
The respective nucleotide sequences are as shown in the 
sequence listing. 

These clones have an open reading frame
comprising nucleotides (nucleic acid) respectively coding
for the amino acids shown in the sequence listing. Their
molecular weights were calculated at the values shown
later herein in the respective examples. Hereinafter,
these human genes of the present invention are sometimes
referred to by the designations used in Examples 1 to 11.

In the following, the human gene of the present
invention is described in further detail.

As mentioned above, each human gene of the 
present invention is analogous to rat, mouse, yeast, 
nematode and known human genes, among others, and can be
utilized in human gene analysis based on the information
about the genes analogous thereto and in studying the
function of the gene analyzed and the relation between
the gene analyzed and a disease. It is possible to use
said gene in gene diagnosis of the disease associated
therewith and in exploitation studies of said gene for
medicinal purposes. 

The gene of the present invention is
represented in terms of a single-stranded DNA sequence,
as shown, for example, under SEQ ID NO:2. It is to be
noted, however,
that the present invention also includes a DNA sequence
complementary to such a single-stranded DNA sequence and
a component comprising both. The sequence of the gene of
the present invention as shown under SEQ ID NO:3n - 1
(where n is an integer of 1 to 14) is merely an example
of the codon combination encoding the respective amino
acid residues. The gene of the present invention is not
limited thereto but can of course have a DNA sequence in
which the codons are arbitrarily selected and combined
for the respective amino acid residues. The codon
selection can be made in the conventional manner, for
example taking into consideration the codon utilization
frequencies in the host to be used [Nucl. Acids Res., 9,
43-74 (1981)].
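
The point that a given amino acid sequence admits many codon
combinations can be illustrated with a short sketch. It builds the
standard genetic code, counts how many distinct DNA sequences encode a
peptide, and back-translates a peptide with a table of preferred
codons. The peptide and the preferred-codon table are hypothetical
placeholders, not values taken from the cited codon usage reference or
from the sequence listing.

```python
from itertools import product
from math import prod

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Group synonymous codons by the amino acid they encode.
SYNONYMS = {}
for codon, aa in CODON_TABLE.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def count_encodings(peptide: str) -> int:
    """Number of distinct DNA sequences that encode the peptide."""
    return prod(len(SYNONYMS[aa]) for aa in peptide)


def back_translate(peptide: str, preferred: dict) -> str:
    """Pick one codon per residue, using `preferred` where it has an entry."""
    return "".join(preferred.get(aa, SYNONYMS[aa][0]) for aa in peptide)


# Hypothetical host preference: use AAG rather than AAA for lysine.
example_dna = back_translate("MK", {"K": "AAG"})
```

Met and Trp each have a single codon, whereas Leu, Ser and Arg each
have six, so the number of possible coding sequences grows quickly with
peptide length.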

The gene of the present invention further
includes DNA sequences coding for functional equivalents
derived from the amino acid sequence mentioned above by
partial amino acid or amino acid sequence substitution,
deletion or addition. These polypeptides may be produced
by spontaneous modification (mutation) or may be obtained
by posttranslational modification or by modifying the
natural gene (of the present invention) by a technique of
genetic engineering, for example by site-specific
mutagenesis [Methods in Enzymology, 154, p. 350, 367-382
(1987); ibid., 100, p. 468 (1983); Nucleic Acids
Research, 12, p. 9441 (1984); Zoku Seikagaku Jikken Koza
(Sequel to Experiments in Biochemistry) 1, "Idenshi
Kenkyu-ho (Methods in Gene Research) II", edited by the
Japan Biochemical Society, p. 105 (1986)] or synthesizing
mutant DNAs by a chemical synthetic technique such as the
phosphotriester method or phosphoramidite method [J. Am.
Chem. Soc., 89, p. 4801 (1967); ibid., 91, p. 3350
(1969); Science, 150, p. 178 (1968); Tetrahedron Lett.,
22, p. 1859 (1981); ibid., p. 245 (1983)], or by
utilizing the techniques mentioned above in combination.

The protein encoded by the gene of the present
invention can be expressed readily and stably by
utilizing said gene, for example inserting it into a
vector for use with a microorganism and cultivating the
microorganism thus transformed.

The protein obtained by utilizing the gene of
the present invention can be used in specific antibody
production. In this case, the protein producible in
large quantities by the genetic engineering technique
mentioned above can be used as the component to serve as
an antigen. The antibody obtained may be polyclonal or
monoclonal and can be advantageously used in the
purification, assay, discrimination or identification of
the corresponding protein.

The gene of the present invention can be
readily produced based on the sequence information
thereof disclosed herein by using general genetic
engineering techniques [cf. e.g. Molecular Cloning, 2nd
Ed., Cold Spring Harbor Laboratory Press (1989); Zoku
Seikagaku Jikken Koza, "Idenshi Kenkyu-ho I, II and III",
edited by the Japan Biochemical Society (1986)].

This can be achieved, for example, by selecting 
a desired clone from a human cDNA library (prepared in 
the conventional manner from appropriate cells of origin 
in which the gene is expressed) using a probe or antibody 
specific to the gene of the present invention [e.g. Proc.
Natl. Acad. Sci. USA, 78, 6613 (1981); Science, 222, 778
(1983)].

The cells of origin to be used in the above 
method are, for example, cells or tissues in which the 
gene in question is expressed, or cultured cells derived 
therefrom. Separation of total RNA, separation and 
purification of mRNA, conversion to (synthesis of) cDNA, 
cloning thereof and so on can be carried out by 
conventional methods. cDNA libraries are also
commercially available and such cDNA libraries, for example
various cDNA libraries available from Clontech Lab. Inc.,
can also be used in the above method.

Screening of the gene of the present invention 
from these cDNA libraries can be carried out by the 
conventional method mentioned above. These screening 
methods include, for example, the method comprising 
selecting a cDNA clone by immunological screening using 
an antibody specific to the protein produced by the 
corresponding cDNA, the technique of plaque or colony
hybridization using probes selectively binding to the
desired DNA sequence, or a combination of these. As
regards the probe to be used here, a DNA sequence
chemically synthesized based on the information about the
DNA sequence of the present invention is generally used.
It is of course possible to use the gene of the present
invention or fragments thereof as the probe.

Furthermore, a sense primer and an antisense 
primer designed based on the information about the
partial amino acid sequence of a natural extract isolated 
and purified from cells or a tissue can be used as probes 
for screening. 

For obtaining the gene of the present
invention, the technique of DNA/RNA amplification by the
PCR method [Science, 230, 1350-1354 (1984)] can suitably
be employed. Particularly when the full-length cDNA can
hardly be obtained from the library, the RACE method
[rapid amplification of cDNA ends; Jikken Igaku
(Experimental Medicine), 12 (6), 35-38 (1994)], in
particular the 5' RACE method [Frohman, M. A., et al.,
Proc. Natl. Acad. Sci. USA, 85, 8998-9002 (1988)], is
preferably employed. The primers to be used in such PCR
method can be appropriately designed based on the
sequence information of the gene of the present invention
as disclosed herein and can be synthesized by a
conventional method.

The amplified DNA/RNA fragment can be isolated 
and purified by a conventional method as mentioned above, 
for example by gel electrophoresis. 

The nucleotide sequence of the thus-obtained 
gene of the present invention or any of various DNA 
fragments can be determined by a conventional method, for 
example the dideoxy method [Proc. Natl. Acad. Sci. USA,
74, 5463-5467 (1977)] or the Maxam-Gilbert method
[Methods in Enzymology, 65, 499 (1980)]. Such nucleotide
sequence determination can be readily performed using a
commercially available sequencing kit as well.

When the gene of the present invention is used 
and conventional techniques of recombinant DNA technology 
[see e.g. Science, 224, p. 1431 (1984); Biochem. Biophys. 
Res. Comm., 130, p. 692 (1985); Proc. Natl. Acad. Sci.
USA, 80, p. 5990 (1983) and the references cited above]
are followed, a recombinant protein can be obtained.
More specifically, said protein can be produced by
constructing a recombinant DNA enabling the gene of the
present invention to be expressed in host cells,
introducing it into host cells for transformation thereof
and cultivating the resulting transformant.

In that case, the host cells may be eukaryotic
or prokaryotic. The eukaryotic cells include vertebrate
cells, yeast cells and so on, and the vertebrate cells
include, but are not limited to, simian cells named COS
cells [Cell, 23, 175-182 (1981)], Chinese hamster ovary
cells and a dihydrofolate reductase-deficient cell line
derived therefrom [Proc. Natl. Acad. Sci. USA, 77,
4216-4220 (1980)] and the like, which are frequently used.

As regards the expression vector to be used
with vertebrate cells, an expression vector having a
promoter located upstream of the gene to be expressed,
RNA splicing sites, a polyadenylation site and a
transcription termination sequence can be generally used.
This may further have an origin of replication as
necessary. As an example of said expression vector,
there may be mentioned pSV2dhfr [Mol. Cell. Biol., 1, 854
(1981)], which has the SV40 early promoter. As for the
eukaryotic microorganisms, yeasts are generally and
frequently used and, among them, yeasts of the genus
Saccharomyces can be used with advantage. As regards the
expression vector for use with said yeasts and other
eukaryotic microorganisms, pAM82 [Proc. Natl. Acad. Sci.
USA, 80, 1-5 (1983)], which has the acid phosphatase gene
promoter, for instance, can be used.

Furthermore, a prokaryotic gene fused vector 
can be preferably used as the expression vector for the 
gene of the present invention. As specific examples of
said vector, there may be mentioned pGEX-2TK and pGEX-4T-2,
which have a GST domain (derived from S. japonicum)
with a molecular weight of 26,000.

Escherichia coli and Bacillus subtilis are 
generally and preferably used as prokaryotic hosts. When 
these are used as hosts in the practice of the present 
invention, an expression plasmid derived from a plasmid
vector capable of replicating in said host organisms and
provided in this vector with a promoter and the SD (Shine
and Dalgarno) sequence upstream of said gene for enabling
the expression of the gene of the present invention and
further provided with an initiation codon (e.g. ATG)
necessary for the initiation of protein synthesis is
preferably used. The Escherichia coli strain K12, among
others, is preferably used as the host Escherichia coli,
and pBR322 and modified vectors derived therefrom are
generally and preferably used as the vector, while
various known strains and vectors can also be used.
Examples of the promoter which can be used are the
tryptophan (trp) promoter, lpp promoter, lac promoter and
PL/PR promoter.

The thus-obtained desired recombinant DNA can
be introduced into host cells for transformation by using
various general methods. The transformant obtained can
be cultured by a conventional method and the culture
leads to expression and production of the desired protein
encoded by the gene of the present invention. The medium 
to be used in said culture can suitably be selected from 
among various media in conventional use according to the 
host cells employed. The host cells can be cultured 
under conditions suited for the growth thereof. 

In the above manner, the desired recombinant 
protein is expressed and produced and accumulated or 
secreted within the transformant cells or extracellularly 
or on the cell membrane.

The recombinant protein can be separated and 
purified as desired by various separation procedures 
utilizing the physical, chemical and other properties 
thereof [cf. e.g. "Seikagaku (Biochemistry) Data Book
II", pages 1175-1259, 1st Edition, 1st Printing,
published June 23, 1980 by Tokyo Kagaku Dojin;
Biochemistry, 25 (25), 8274-8277 (1986); Eur. J. Biochem.,
163, 313-321 (1987)]. Specifically, said procedures
include, among others, ordinary reconstitution treatment,
treatment with a protein precipitating agent (salting
out), centrifugation, osmotic shock treatment,
sonication, ultrafiltration, various liquid
chromatography techniques such as molecular sieve chromatography
(gel filtration), adsorption chromatography, ion exchange
chromatography, affinity chromatography and
high-performance liquid chromatography (HPLC), dialysis and
combinations thereof. Among them, affinity
chromatography utilizing a column with the desired protein bound
thereto is particularly preferred.

Furthermore, on the basis of the sequence 
information about the gene of the present invention as 
revealed by the present invention, for example by 
utilizing part or the whole of said gene, it is possible 
to detect the expression of the gene of the present 
invention in various human tissues. This can be 
performed by a conventional method, for example by RNA
amplification by RT-PCR (reverse transcription-polymerase
chain reaction) [Kawasaki, E. S., et al., Amplification
of RNA, in PCR Protocols, A Guide to Methods and
Applications, Academic Press, Inc., San Diego, 21-27
(1991)], or by northern blotting analysis [Molecular
Cloning, Cold Spring Harbor Laboratory (1989)], with good
results.

The primers to be used in employing the
above-mentioned PCR method are not limited to any particular
ones provided that they are specific to the gene of the
present invention and enable the gene of the present
invention alone to be specifically amplified. They can
be designed or selected appropriately based on the gene
information provided by the present invention. They can
have a partial sequence comprising about 20 to 30
nucleotides according to the established practice.
Suitable examples are as shown in Examples 1 to 11.

Thus, the present invention also provides
primers and/or probes useful in specifically detecting
such novel gene.
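
The length guideline for primers given above (about 20 to 30
nucleotides) can be checked mechanically. The sketch below is a
minimal illustration and not the inventors' design procedure: it tests
the length window, derives the antisense-strand counterpart of a
stretch by reverse complementation, and estimates a melting
temperature with the Wallace rule, Tm = 2(A+T) + 4(G+C), which is only
a rough rule of thumb for short oligonucleotides. The example sequence
is hypothetical and does not come from the sequence listing.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Sequence of the opposite strand, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in the primer."""
    p = primer.upper()
    return (p.count("G") + p.count("C")) / len(p)


def wallace_tm(primer: str) -> int:
    """Rough melting temperature in degrees C by the Wallace rule."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc


def length_ok(primer: str, shortest: int = 20, longest: int = 30) -> bool:
    """True if the primer falls in the 20 to 30 nucleotide window."""
    return shortest <= len(primer) <= longest


# Hypothetical 21-nucleotide sense primer, for illustration only.
sense = "ATGGCCAAGCTTGACCTGAAG"
antisense_of_same_stretch = reverse_complement(sense)
```

In practice the antisense primer is taken from a region downstream of
the sense primer; the same stretch is reverse-complemented here only
to keep the example short.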

By using the novel gene provided by the present 
invention, it is possible to detect the expression of 
said gene in various tissues, analyze the structure and 
function thereof and, further, produce the human protein
encoded by said gene in the manner of genetic
engineering. These make it possible to analyze the
expression product, reveal the pathology of a disease 
associated therewith, for example a genopathy or cancer, 
and diagnose and treat the disease. 

The following drawings are referred to in the
examples.

Fig. 1 shows the result obtained by testing the
PI4 kinase activity of NPIK in Example 9. Fig. 2 shows 
the effect of Triton X-100 and adenosine on NPIK 
activity. 

EXAMPLES 

The following examples illustrate the present 
invention in further detail. 

Example 1
GDP dissociation stimulator gene

(1) Cloning and DNA sequencing of GDP dissociation 
stimulator gene 

mRNAs extracted from the tissues of human fetal 
brain, adult blood vessels and placenta were purchased 
from Clontech and used as starting materials. 

cDNA was synthesized from each mRNA and 
inserted into the vector λZAPII (Stratagene) to thereby
construct a cDNA library (Otsuka GEN Research Institute,
Otsuka Pharmaceutical Co., Ltd.).

Human gene-containing Escherichia coli colonies 
were allowed to form on agar medium by the in vivo 
excision technique [Short, J. M., et al., Nucleic Acids
Res., 16, 7583-7600 (1988)]. Colonies were picked up at
random and human gene-containing Escherichia coli clones
were registered on 96-well micro plates. The clones
registered were stored at -80°C.

Each of the clones registered was cultured 
overnight in 1.5 ml of LB medium, and DNA was extracted 
and purified using a model PI-100 automatic plasmid 
extractor (Kurabo). Contaminant Escherichia coli RNA was
decomposed and removed by RNase treatment. The DNA was
dissolved to a final volume of 30 µl. A 2-µl portion was
used for roughly checking the DNA size and quantity using
a minigel, 7 µl was used for sequencing reactions and the
remaining portion (21 µl) was stored as plasmid DNA at
4°C.

This method, after slight changes in the
program, enables extraction of the cosmid, which is
useful also as a probe for FISH (fluorescence in situ
hybridization) shown later in the examples.

Then, the dideoxy terminator method of Sanger
et al. [Sanger, F., et al., Proc. Natl. Acad. Sci. USA,
74, 5463-5467 (1977)] using T3, T7 or a synthetic
oligonucleotide primer or the cycle sequencing method
[Carothers, A. M., et al., BioTechniques, 7, 494-499
(1989)] comprising the dideoxy chain terminator method
plus PCR method was carried out. These are methods of
terminating the extension reaction specifically to the
four bases using a small amount of plasmid DNA (about 0.1
to 0.5 µg) as a template.

The sequence primers used were FITC
(fluorescein isothiocyanate)-labeled ones. Generally,
about 25 cycles of reaction were performed using Taq
polymerase. The PCR products were separated on a
polyacrylamide urea gel and the fluorescence-labeled DNA
fragments were submitted to an automatic DNA sequencer
(ALF™ DNA Sequencer; Pharmacia) for determining the
sequence of about 400 bases from the 5' terminus side of
cDNA.

Since the 3' untranslated region is high in
heterogeneity for each gene and therefore suited for
discriminating individual genes from one another,
sequencing was performed on the 3' side as well depending
on the situation.

The large amount of nucleotide sequence information
obtained from the DNA sequencer was transferred to a
64-bit DEC 3400 computer for homology analysis by the
computer. In the homology analysis, a data base
(GenBank, EMBL) was used for searching according to the
UWGCG FASTA program [Pearson, W. R. and Lipman, D. J.,
Proc. Natl. Acad. Sci. USA, 85, 2444-2448 (1988)].
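
The word-lookup idea behind FASTA-type searching can be sketched as
follows. This is a simplified illustration and not the UWGCG FASTA
program itself: it indexes every k-tuple (word of length k) of a data
base sequence, looks up the k-tuples of the query, and counts hits per
diagonal (subject position minus query position). A diagonal with many
hits marks a candidate region of similarity that a full program would
then rescore and align. The function names and the toy sequences used
in testing are hypothetical.

```python
from collections import Counter, defaultdict


def ktuple_diagonals(query: str, subject: str, k: int = 6) -> Counter:
    """Count shared k-tuples per diagonal (subject offset minus query offset)."""
    index = defaultdict(list)
    for j in range(len(subject) - k + 1):
        index[subject[j:j + k]].append(j)

    diagonals = Counter()
    for i in range(len(query) - k + 1):
        # .get avoids creating empty entries for words absent from the subject.
        for j in index.get(query[i:i + k], ()):
            diagonals[j - i] += 1
    return diagonals


def best_diagonal(query: str, subject: str, k: int = 6):
    """Return (diagonal, hit count) for the richest diagonal, or None."""
    diagonals = ktuple_diagonals(query, subject, k)
    return diagonals.most_common(1)[0] if diagonals else None
```

A smaller k is more sensitive but slower; the value of k used by the
inventors is not stated in the text.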

As a result of arbitrary selection by the above
method and of cDNA sequence analysis, a clone designated 
as GEN-501D08 and having a 0.8 kilobase insert was found 
to show a high level of homology to the C terminal region 
of the human Ral guanine nucleotide dissociation 
stimulator (RalGDS) gene. Since RalGDS is considered to 
play a certain role in signal transmission pathways, the 
whole nucleotide sequence of the cDNA insert portion 
providing the human homolog was further determined. 

Low-molecular GTPases play an important role in
transmitting signals for a number of cell functions
including cell proliferation, differentiation and
transformation [Bourne, H. R. et al., Nature, 348, 125-132
(1990); Bourne et al., Nature, 349, 117-127 (1991)].

It is well known that, among them, those
proteins encoded by the ras gene family function as
molecular switches or, in other words, the functions of
the ras gene family are regulated by different conditions
of binding proteins such as biologically inactive
GDP-binding proteins or active GTP-binding proteins, and that
these two conditions are induced by GTPase activating
proteins (GAPs) or GDS. The former enzymes induce GDP
binding by stimulating the hydrolysis of bound GTP and
the latter enzyme induces the regular GTP binding by
releasing bound GDP [Boguski, M. S. and McCormick, F.,
Nature, 366, 643-654 (1993)].

RalGDS was first discovered as a member of the 
ras gene family lacking in transforming activity and as a 
GDP dissociation stimulator specific to RAS [Chardin, P. 
and Tavitian, A., EMBO J., 5, 2203-2208 (1986); Albright,
C. F., et al., EMBO J., 12, 339-347 (1993)].

In addition to Ral, RalGDS was found to 
function, through interaction with these proteins, as an 
effector molecule for N-ras, H-ras, K-ras and Rap 
[Spaargaren, M. and Bischoff, J. R., Proc. Natl. Acad.
Sci. USA, 91, 12609-12613 (1994)].

The nucleotide sequence of the cDNA clone
designated as GEN-501D08 is shown under SEQ ID NO:3, the
nucleotide sequence of the coding region of said clone
under SEQ ID NO:2, and the amino acid sequence encoded by
said nucleotide sequence under SEQ ID NO:1.

This cDNA comprises 842 nucleotides, including
an open reading frame comprising 366 nucleotides and 
coding for 122 amino acids. The translation initiation 
codon was found to be located at the 28th nucleotide 
residue. 
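
The figures above are mutually consistent: 122 codons correspond to
3 x 122 = 366 nucleotides. The sketch below works through that
arithmetic and shows a simple in-frame scan for the stop codon. It
assumes that the 366-nucleotide figure excludes the stop codon (since
366 is exactly 3 x 122); under that assumption the last coding
nucleotide would be number 393, a derived value that the text itself
does not state. The toy sequence used in testing is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def coding_span(start_1based: int, n_codons: int) -> tuple:
    """Return (coding length, 1-based position of the last coding nucleotide)."""
    length = 3 * n_codons
    return length, start_1based + length - 1


def find_orf(cdna: str, start_index: int):
    """Scan in frame from an ATG at 0-based start_index to the first stop codon.

    Returns 1-based inclusive (start, end) of the coding part, stop codon
    excluded, or None if no in-frame stop codon is found.
    """
    if cdna[start_index:start_index + 3] != "ATG":
        raise ValueError("no ATG at the given position")
    for pos in range(start_index, len(cdna) - 2, 3):
        if cdna[pos:pos + 3] in STOP_CODONS:
            return start_index + 1, pos
    return None


# Figures from the text: initiation codon at nucleotide 28, 122 amino acids.
orf_length, last_coding_nt = coding_span(28, 122)
```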

Comparison between the RalGDS protein known 
among conventional databases and the amino acid sequence
deduced from said cDNA revealed that the protein encoded 
by this cDNA is homologous to the C terminal domain of 
human RalGDS. The amino acid sequence encoded by this 
novel gene was found to be 39.5% identical with the C 
terminal domain of RalGDS which is thought to be 
necessary for binding to ras. 
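
A percent identity such as the 39.5% quoted above is computed from an
alignment of the two sequences. The following minimal sketch counts
identical residues over aligned columns in which neither sequence has
a gap. The text does not state which denominator was used for the
reported figure, so this convention is an assumption, and the aligned
sequences used in testing are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical residues over gap-free aligned columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # columns with a gap are left out of the denominator
        compared += 1
        if x == y:
            matches += 1

    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared
```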

Therefore, it is presumable, as mentioned
above, that this gene product might interact with the ras 
family proteins or have influence on the ras-mediated 
signal transduction pathways. However, this novel gene 
is lacking in the region coding for the GDS activity 
domain and the corresponding protein seems to be 
different in function from the GDS protein. This gene 
was named human RalGDS by the present inventors.

(2) Northern blot analysis

The expression of the RalGDS protein mRNA in
normal human tissues was evaluated by Northern blotting
using, as a probe, the human cDNA clone labeled by the
random oligonucleotide priming method.

The Northern blot analysis was carried out with 
a human MTN blot (Human Multiple Tissue Northern blot; 
Clontech, Palo Alto, CA, USA) according to the
manufacturer's protocol.

Thus, the PCR amplification product from the 
above GEN-501D08 clone was labeled with [32P]-dCTP
(random-primed DNA labeling kit, Boehringer-Mannheim) for
use as a probe. 

For blotting, hybridization was performed 
overnight at 42°C in a solution comprising 50%
formamide/5 x SSC/50 x Denhardt's solution/0.1% SDS
(containing 100 µg/ml denatured salmon sperm DNA). After
washing with two portions of 2 x SSC/0.01% SDS at room
temperature, the membrane filter was further washed three
times with 0.1 x SSC/0.05% SDS at 50°C for 40 minutes.
An X-ray film (Kodak) was exposed to the filter at -70°C
for 18 hours.

As a result, it was revealed that a 900-bp 
transcript had been expressed in all the human tissues 
tested. In addition, a 3.2-kb transcript was observed 
specifically in the heart and skeletal muscle. The
expression of these transcripts differing in size may be
due either to alternative splicing or to cross
hybridization with homologous genes.

(3) Cosmid clone and chromosome localization by FISH 

FISH was performed by screening a library of 
human chromosomes cloned in the cosmid vector pWE15
using, as a probe, the 0.8-kb insert of the cDNA clone
[Sambrook, J., et al., Molecular Cloning, 2nd Ed., pp.
3.1-3.58, Cold Spring Harbor Laboratory Press, Cold 
Spring Harbor, New York (1989)]. 

FISH for chromosome assignment was carried out 
by the method of Inazawa et al. which comprises G-banding 
pattern comparison for confirmation [Inazawa, J., et al.,
Genomics, 17, 153-162 (1993)].

For use as a probe, the cosmid DNA (0.5 µg)
obtained from chromosome screening and corresponding to 
GEN-501D08 was labeled with biotin-16-dUTP by nick 
translation. 

To eliminate the background noise due to 
repetitive sequences, 0.5 µl of sonicated human placenta
DNA (10 mg/ml) was added to 9.5 µl of the probe solution.
The mixture was denatured at 80°C for 5 minutes and
admixed with an equal volume of 4 x SSC containing 20%
dextran sulfate. Then, the hybridization mixture was
applied to a denatured slide and, after covering with
paraffin, incubated in a wet chamber at 37°C for 16 to 18
hours. After washing with 50% formamide/2 x SSC at 37°C
for 15 minutes, the slide was washed with 2 x SSC for 15
minutes and further with 1 x SSC for 15 minutes.
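
The final composition of the hybridization mixture described above
follows from simple dilution arithmetic (final concentration = stock
concentration x volume added / total volume). The sketch below works
it through. It assumes that the probe solution itself contributes no
SSC or dextran sulfate, which the text does not state; the variable
and function names are hypothetical.

```python
def dilute(stock_concentration: float, volume_added: float, total_volume: float) -> float:
    """Final concentration after bringing volume_added to total_volume."""
    return stock_concentration * volume_added / total_volume


competitor_ul = 0.5                      # sonicated human placenta DNA, 10 mg/ml
probe_ul = 9.5                           # probe solution
dna_mix_ul = competitor_ul + probe_ul    # 10 µl before adding buffer
buffer_ul = dna_mix_ul                   # "an equal volume" of 4 x SSC / 20% dextran sulfate
total_ul = dna_mix_ul + buffer_ul        # 20 µl hybridization mixture

final_competitor_mg_per_ml = dilute(10.0, competitor_ul, total_ul)  # 0.25 mg/ml
final_ssc_x = dilute(4.0, buffer_ul, total_ul)                      # 2 x SSC
final_dextran_percent = dilute(20.0, buffer_ul, total_ul)           # 10%
```

Under these assumptions the slide is hybridized in 2 x SSC with 10%
dextran sulfate and 0.25 mg/ml competitor DNA.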

The slide was then incubated in 4 x SSC
supplemented with "1% Block Ace" (trademark; Dainippon
Pharmaceutical) containing avidin-FITC (5 µg/ml) at 37°C for 40
minutes. Then, the slide was washed with 4 x SSC for 10
minutes and with 4 x SSC containing 0.05% Triton X-100
for 10 minutes and immersed in an antifading PPD solution
[prepared by adjusting 100 mg of PPD (Wako Catalog No.
164-015321) and 10 ml of PBS(-) (pH 7.4) to pH 8.0 with
0.5 M Na2CO3/0.5 M NaHCO3 (9:1, v/v) buffer (pH 9.0) and
adding glycerol to make a total volume of 100 ml]
containing 1% DABCO [1% DABCO (Sigma) in PBS(-):glycerol
1:9 (v:v)], followed by counter staining with DAPI
(4,6-diamidino-2-phenylindole; Sigma).

With more than 100 tested cells in the 
metaphase, a specific hybridization signal was observed 
on the chromosome band at 6p21.3, without any signal on 
other chromosomes. It was thus confirmed that the RalGDS 
gene is located on the chromosome 6p21.3. 

By using the novel human RalGDS-associated gene 
of the present invention as obtained in this example, the 
expression of said gene in various tissues can be
detected and the human RalGDS protein can be produced in
the manner of genetic engineering. These are expected to
enable studies on the roles of the expression product
protein and ras-mediated signals in transduction pathways
as well as pathological investigations of diseases in
which these are involved, for example cancer, and the
diagnosis and treatment of such diseases. Furthermore, 
it becomes possible to study the development and progress 
of diseases involving the same chromosomal translocation 
of the RalGDS protein gene of the present invention, for 
example tonic spondylitis, atrial septal defect, 
pigmentary retinopathy, aphasia and the like. 

Example 2 

Cytoskeleton-associated protein 2 gene (CKAP2 gene)
(1) Cytoskeleton-associated protein 2 gene cloning and
DNA sequencing

cDNA clones arbitrarily chosen from a
human fetal brain cDNA library in the same manner as in
Example 1 were subjected to sequence analysis and, as a
result, a clone having a base sequence containing the
CAP-glycine domain of the human cytoskeleton-associated
protein (CAP) gene and highly homologous to several CAP
family genes was found and named GEN-080G01.

Meanwhile, the cytoskeleton occurs in the
cytoplasm and just inside the cell membrane of eukaryotic
cells and is a network structure comprising complicatedly
entangled filaments. Said cytoskeleton is constituted of
microtubules composed of tubulin, microfilaments composed
of actin, intermediate filaments composed of desmin and
vimentin, and so on. The cytoskeleton not only acts as
supportive cellular elements but also isokinetically 
functions to induce morphological changes of cells by 
polymerization and depolymerization in the fibrous 
system. The cytoskeleton binds to intracellular 
organelles, cell membrane receptors and ion channels and 
thus plays an important role in intracellular movement 
and locality maintenance thereof and, in addition, is 
said to have functions in activity regulation and mutual 
information transmission. Thus it supposedly occupies a 
very important position in physiological activity 
regulation of the whole cell. In particular, the 
relation between canceration of cells and qualitative 
changes of the cytoskeleton attracts attention since 
cancer cells differ in morphology and recognition 
response from normal cells. 

The activity of this cytoskeleton is modulated
by a number of cytoskeleton-associated proteins (CAPs).
One group of CAPs is characterized by a glycine motif
highly conserved and supposedly contributing to
association with microtubules [CAP-GLY domain; Riehemann, K.
and Sorg, C., Trends Biochem. Sci., 18, 82-83 (1993)].

Among the members of this group of CAPs, there
are CLIP-170, 150 kDa DAP (dynein-associated protein, or
dynactin), D. melanogaster GLUED, S. cerevisiae BIK1,
restin [Bilbe, G., et al., EMBO J., 11, 2103-2113
(1992); Hilliker, C., et al., Cytogenet. Cell Genet.,
65, 172-176 (1994)] and C. elegans 13.5 kDa protein
[Wilson, R., et al., Nature, 368, 32-38 (1994)]. Except
for the last two proteins, direct or indirect evidence
has suggested that they could interact with
microtubules.

The above-mentioned CLIP-170 is essential for
the in vitro binding of endocytic vesicles to
microtubules and colocalizes with endocytic organelles
[Rickard, J. E. and Kreis, T. E., J. Biol. Chem., 18, 82-
83 (1990); Pierre, P., et al., Cell, 70, 887-900 (1992)].

The above-mentioned dynactin is one of the
factors constituting the cytoplasmic dynein motor, which
functions in retrograde vesicle transport [Schroer, T. A.
and Sheetz, M. P., J. Cell Biol., 115, 1309-1318 (1991)]
or probably in the movement of chromosomes during mitosis
[Pfarr, C. M., et al., Nature, 345, 263-265 (1990);
Steuer, E. R., et al., Nature, 345, 266-268 (1990);
Wordeman, L., et al., J. Cell Biol., 114, 285-294
(1991)].

GLUED, the Drosophila homolog of mammalian
dynactin, is essential for the viability of almost all
cells and for the proper organization of some neurons
[Swaroop, A., et al., Proc. Natl. Acad. Sci. USA, 84,
6501-6505 (1987); Holzbaur, E. L. P., et al., Nature,
351, 579-583 (1991)].

BIK1 interacts with microtubules and plays an
important role in spindle formation during mitosis in
yeasts [Trueheart, J., et al., Mol. Cell. Biol., 7, 2316-
2326 (1987); Berlin, V., et al., J. Cell Biol., 111,
2573-2586 (1990)].

At present, these genes are classified under
the term CAP family (CAPs).

As a result of database searching, the
above-mentioned cDNA clone of 463-bp (excluding the poly-A
signal) showed significant homology in nucleotide
sequence with the restin and CLIP-170 encoding genes.
However, said clone was lacking in the 5' region as
compared with the restin gene and, therefore, the
technique of 5' RACE [Frohman, M. A., et al., Proc. Natl.
Acad. Sci. USA, 85, 8998-9002 (1988)] was used to isolate
this missing segment.

(2) 5' RACE (5' rapid amplification of cDNA ends)

A cDNA clone containing the 5' portion of the
gene of the present invention was isolated for analysis
by the 5' RACE technique using a commercial kit (5'-Rapid
AmpliFinder RACE kit, Clontech) according to the
manufacturer's protocol with minor modifications, as
follows.

The gene-specific primer P1 and primer P2 used
here were synthesized by the conventional method and
their nucleotide sequences are as shown below in Table 1.
The anchor primer used was the one attached to the
commercial kit.



Table 1

Primer        Nucleotide sequence
Primer P1     5'-ACACCAATCCAGTAGCCAGGCTTG-3'
Primer P2     5'-CACTCGAGAATCTGTGAGACCTACATACATGACG-3'
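
For readers who wish to check the primers in Table 1, the sketch below computes their length, GC content and reverse complement. The melting-temperature estimate uses the Wallace rule (2°C per A/T, 4°C per G/C), which is only a rough rule of thumb intended for short oligonucleotides; it is an illustrative assumption and not a method stated in this specification. The last base of P1 is read here as G.

```python
# Basic properties of the two gene-specific primers listed in Table 1.
# The Wallace rule below is a rough rule of thumb for short oligonucleotides;
# it is an illustrative assumption, not the method used in the source.

PRIMERS = {
    "P1": "ACACCAATCCAGTAGCCAGGCTTG",  # final base read as G
    "P2": "CACTCGAGAATCTGTGAGACCTACATACATGACG",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Wallace-rule melting temperature estimate in degrees Celsius."""
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), f"GC={gc_content(seq):.2f}",
              f"Tm~{wallace_tm(seq)}", reverse_complement(seq))
```

Because P2 is considerably longer than the range for which the Wallace rule is meant, its estimate in particular should be treated as indicative only.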



cDNA was obtained by reverse transcription of
0.1 µg of human fetal brain poly(A)+ RNA by the random
hexamer technique using reverse transcriptase
(Superscript(TM) II, Life Technologies) and the cDNA was
amplified by the first PCR using the P1 primer and anchor
primer according to Watanabe et al. [Watanabe, T., et
al., Cell Genet., in press].

Thus, to 0.1 µg of the above-mentioned cDNA
were added 2.5 mM dNTP/1 x Taq buffer (Takara Shuzo)/0.2
µM P1 primer, 0.2 µM adaptor primer/0.25 unit ExTaq
enzyme (Takara Shuzo) to make a total volume of 50 µl,
followed by addition of the anchor primer. The mixture
was subjected to PCR. Thus, 35 cycles of amplification
were performed under the conditions: 94°C for 45 seconds,
60°C for 45 seconds, and 72°C for 2 minutes. Finally,
the mixture was heated at 72°C for 5 minutes.
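
The cycling conditions above can be written down as a small program, which also gives the total hold time of the first PCR. This is a sketch only: the three temperatures are read as 94°C, 60°C and 72°C, the class and function names are illustrative, and ramp times between steps (which depend on the thermal cycler) are deliberately ignored.

```python
# The first-PCR thermal profile as data, with its total hold time.
# Ramp times between temperatures are instrument dependent and are ignored.
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int
    seconds: int


CYCLE = (Step(94, 45), Step(60, 45), Step(72, 120))  # denature, anneal, extend
N_CYCLES = 35
FINAL_EXTENSION = Step(72, 300)


def total_hold_seconds(cycle, n_cycles, final_step) -> int:
    """Sum of programmed hold times over all cycles plus the final step."""
    return n_cycles * sum(step.seconds for step in cycle) + final_step.seconds


if __name__ == "__main__":
    seconds = total_hold_seconds(CYCLE, N_CYCLES, FINAL_EXTENSION)
    print(f"{seconds} s (~{seconds / 60:.1f} min) excluding ramping")
```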

Then, 1 µl of the 50-µl first PCR product was
subjected to amplification by the second PCR using the
specific nested P2 primer and anchor primer. The second
PCR product was analyzed by 1.5% agarose gel
electrophoresis.

Upon agarose gel electrophoresis, a single
band, about 650 nucleotides in size, was detected. The
product from this band was inserted into a vector
(pT7Blue(R) T-Vector, Novagen) and a plurality of clones
with an insert having an appropriate size were selected.

Six of the 5' RACE clones obtained from the PCR
product had the same sequence but had different lengths.
By sequencing two overlapping cDNA clones, GEN-080G01 and
GEN-080G0149, the protein-encoding sequence and 5' and 3'
flanking sequences, 1015 nucleotides in total length,
were determined. Said gene was named
cytoskeleton-associated protein 2 gene (CKAP2 gene).

The nucleotide sequence obtained from the
above-mentioned two overlapping cDNA clones GEN-080G01
and GEN-080G0149 is shown under SEQ ID NO: 6, the
nucleotide sequence of the coding region of said clone
under SEQ ID NO: 5, and the amino acid sequence encoded by
said nucleotide sequence under SEQ ID NO: 4.

As shown under SEQ ID NO: 6, the CKAP2 gene had
a relatively GC-rich 5' noncoding region, with incomplete
triplet repeats, (CAG)4(CGG)4(CTG)(CGG), occurring at
nucleotides 40-69.

ATG located at nucleotides 274-276 is the
presumable start codon. A stop codon (TGA) was situated
at nucleotides 853-855. A polyadenylation signal
(ATTAAA) was followed by 16 nucleotides before the
poly(A) start. The estimated open reading frame
comprises 579 nucleotides coding for 193 amino acid
residues with a calculated molecular weight of 21,800
daltons.
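
The coordinates given above are mutually consistent, as the following sketch shows. It recomputes the open reading frame length from the start and stop codon positions and checks that the triplet repeat block fits the stated nucleotide range. The function name is illustrative, and the per-residue mass it prints is only a plausibility check against the stated molecular weight, not a recalculation from the sequence.

```python
# Consistency check of the CKAP2 coordinates quoted in the text (1-based,
# positions refer to SEQ ID NO: 6).

def orf_summary(start_codon_first: int, stop_codon_first: int):
    """Return (coding nucleotides, amino acid residues), stop codon excluded."""
    coding_nt = stop_codon_first - start_codon_first
    if coding_nt <= 0 or coding_nt % 3 != 0:
        raise ValueError("start and stop codons are not in the same frame")
    return coding_nt, coding_nt // 3


# ATG at 274-276, TGA at 853-855
CODING_NT, RESIDUES = orf_summary(274, 853)

# (CAG)4(CGG)4(CTG)(CGG) said to occupy nucleotides 40-69
REPEAT = "CAG" * 4 + "CGG" * 4 + "CTG" + "CGG"
REPEAT_SPAN = 69 - 40 + 1

if __name__ == "__main__":
    print("coding nucleotides:", CODING_NT)
    print("residues:", RESIDUES)
    print("mean residue mass (Da):", round(21800 / RESIDUES, 1))
    print("repeat length matches span:", len(REPEAT) == REPEAT_SPAN)
```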

The coding region was further amplified by RT-PCR
to eliminate the possibility of the synthetic
sequence obtained being a cDNA chimera.

(2) Similarity of CKAP2 to other CAPs

While sequencing of CKAP2 revealed homology
with the sequences of restin and CLIP-170, the homologous
region was limited to a short sequence corresponding to
the CAP-GLY domain. On the amino acid level, the deduced
CKAP2 was highly homologous to five other CAPs in this
domain.

CKAP2 lacked other motifs characteristic of
some CAPs, such as the alpha helical rod and
zinc finger motif. The alpha helical rod is thought to
contribute to dimerization and to increase the
microtubule binding capacity [Pierre, P., et al., Cell, 70,
887-900 (1992)]. The lack of the alpha helical domain
might mean that CKAP2 is incapable of homo- or
heterodimer formation.

Alignment of the CAP-GLY domains of these
proteins revealed that conserved residues other
than glycine residues are also found in CKAP2. CAPs
having a CAP-GLY domain are thought to be associated with 
the activities of cellular organelles and the 
interactions thereof with microtubules. Since it 
contains a CAP-GLY domain, as mentioned above, CKAP2 is 
placed in the family of CAPs. 

Studies with mutants of Glued have revealed
that the Glued product plays an important role in almost
all cells [Swaroop, A., et al., Proc. Natl. Acad. Sci.
USA, 84, 6501-6505 (1987)] and that it has other
neuron-specific functions in neuronal cells [Meyerowitz, E. M.
and Kankel, D. R., Dev. Biol., 62, 112-142 (1978)].
These microtubule-associated proteins are thought to
function in vesicle transport and mitosis. Because of
the importance of the vesicle transport system in
neuronal cells, defects in these components might lead to
aberrant neuronal systems.

In view of the above, CKAP2 might be involved
in specific neuronal functions as well as in fundamental
cellular functions.

(3) Northern blot analysis

The expression of human CKAP2 mRNA in normal
human tissues was examined by Northern blotting in the
same manner as in Example 1 (2) using the GEN-080G01
clone (corresponding to nucleotides 553-1015) as a probe.

As a result, in all the eight tissues tested, 
namely human heart, brain, placenta, lung, liver, 
skeletal muscle, kidney and pancreas, a 1.0 kb transcript 
agreeing in size with the CKAP2 cDNA was detected. Said 
1.0 kb transcript was expressed at significantly higher 
levels in heart and brain than in the other tissues 
examined. Two weak bands, 3.4 kb and 4.6 kb, were also 
detected in all the tissues examined. 

According to the Northern blot analysis, the 
3.4 kb and 4.6 kb transcripts might possibly be derived 
from the same gene coding for the 1.0 kb CKAP2 by 
alternative splicing or transcribed from other related 
genes. These characteristics of the transcripts may 
indicate that CKAP2 might also code for a protein having 
a CAP-GLY domain as well as an alpha helix. 

(4) Cosmid cloning and chromosomal localization by
direct R-banding FISH

Two cosmids corresponding to the CKAP2 cDNA
were obtained. These two cosmid clones were subjected to
direct R-banding FISH in the same manner as in Example 1
(3) for chromosomal locus mapping of CKAP2.

For suppressing the background due to
repetitive sequences, a 20-fold excess of human
Cot-1 DNA (BRL) was added as described by Lichter et al.
[Lichter, P., et al., Proc. Natl. Acad. Sci. USA, 87,
6634-6638 (1990)]. A Provia 100 film (Fuji ISO 100; Fuji
Photo Film) was used for photomicrography.

As a result, CKAP2 was mapped on chromosome
bands 19q13.11-q13.12.

Two autosomal dominant neurological diseases 
have been localized to this region by linkage analysis: 
CADASIL (cerebral autosomal dominant arteriopathy with 
subcortical infarcts and leukoencephalopathy) between the 
DNA markers D19S221 and D19S222, and FHM (familial 
hemiplegic migraine) between D19S215 and D19S216. These 
two diseases may be allelic disorders in which the same 
gene is involved [Tournier-Lasserve, E., et al., Nature
Genet., 3, 256-259 (1993); Joutel, A., et al., Nature
Genet., 5, 40-45 (1993)].

Although no evidence is available to support
CKAP2 as a candidate gene for FHM or CADASIL, it is
conceivable that its mutation might lead to some
neurological disease.

By using the novel human CKAP2 gene of the 
present invention as obtained in this example, it is 
possible to detect the expression of said gene in various 
tissues or produce the human CKAP2 gene in the manner of 
genetic engineering. Through these, it becomes possible 
to analyze the functions of the human CKAP2 system or 
human CKAP2, which is involved in diverse activities 
essential to cells, as mentioned above, to diagnose 
various neurological diseases in which said system or 
gene is involved, for example familial migraine, and to 
screen out and evaluate a therapeutic or prophylactic 
drug therefor. 

Example 3 

OTK27 gene 

(1) OTK27 gene cloning and DNA sequencing 

As a result of sequence analysis of cDNA clones
arbitrarily selected from a human fetal brain cDNA library
in the same manner as in Example 1 (1) and database
searching, a cDNA clone, GEN-025F07, coding for a protein
highly homologous to NHP2, a yeast nucleoprotein
[Saccharomyces cerevisiae; Kolodrubetz, D. and Burgum,
A., Yeast, 7, 79-90 (1991)], was found and named OTK27.

Nucleoproteins are fundamental cellular constituents
of chromosomes, ribosomes and so forth and are
thought to play an essential role in cell multiplication
and viability. The yeast nucleoprotein NHP2, a
high-mobility group (HMG)-like protein, reportedly has,
like HMG, a function essential for cell viability
[Kolodrubetz, D. and Burgum, A., Yeast, 7, 79-90 (1991)].

The novel human gene, OTK27 gene, of the 
present invention, which is highly homologous to the 
above-mentioned yeast NHP2 gene, is supposed to be 
similar in function. 

The nucleotide sequence of said GEN-025F07
clone was found to comprise 1493 nucleotides, as shown
under SEQ ID NO: 9, and contain an open reading frame 
comprising 384 nucleotides, as shown under SEQ ID NO: 8, 
coding for an amino acid sequence comprising 128 amino 
acid residues, as shown under SEQ ID NO: 7. The 
initiation codon was located at nucleotides 95-97 of the 
sequence shown under SEQ ID NO: 9, and the termination 
codon at nucleotides 479-481. 

At the amino acid level, the OTK27 protein was
highly homologous (38%) to NHP2. It was 83% identical
with the protein deduced from the cDNA from Arabidopsis
thaliana (Newman, T., unpublished; GENEMBL Accession No.
T14197).
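
Percentages such as those quoted above are obtained by counting identical positions in a pairwise alignment. The sketch below shows one common way to do this. The specification does not state which denominator was used, so the choice here (aligned positions with no gap in either sequence) is an assumption, and the two short sequences in the usage example are hypothetical, not taken from OTK27 or NHP2.

```python
# Percent identity over a pairwise alignment.
# Assumption: positions with a gap in either sequence are excluded from the
# denominator; other conventions exist and give different percentages.

def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Return percent identical residues over ungapped aligned positions."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Hypothetical toy alignment, for illustration only.
    print(percent_identity("MKV-LSAE", "MRVQLSGE"))
```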

(2) Northern blot analysis

For examining the expression of human OTK27
mRNA in normal human tissues, the insert in the OTK27
cDNA was amplified by PCR, the PCR product was purified
and labeled with [32P]-dCTP (random-primed DNA labeling
kit, Boehringer Mannheim), and Northern blotting was
performed using the labeled product as a probe in the
same manner as in Example 1 (2).

As a result of the Northern blot analysis, two
bands corresponding to possible transcripts from this
gene were detected at approximately 1.6 kb and 0.7 kb.
Both sizes of transcript were expressed in all normal
adult tissues examined. However, the expression of the
0.7 kb transcript was significantly reduced in brain and
was of higher levels in heart, skeletal muscle and
testis than in other tissues examined.

For further examination of these two
transcripts, eleven cDNA clones were isolated from a
testis cDNA library and their DNA sequences were
determined in the same manner as in Example 1 (1).

As a result, in six clones, the sequences were
found to be in agreement with that of the 0.7 kb
transcript, with a poly(A) sequence starting at around
the 600th nucleotide, namely at the 598th nucleotide in
two of the six clones, at the 606th nucleotide in three
clones, and at the 613th nucleotide in one clone.

In these six clones, the "TATAAA" sequence was
recognized at nucleotides 583-588 as a probable poly(A)
signal. The upstream poly(A) signal "TATAAA" of this
gene appeared to have little effect in brain and to be
more effective in the three tissues mentioned above than
in other tissues. The possibility was considered that
the stability of each transcript varies from tissue to
tissue.
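
The poly(A) start positions reported for the six testis clones can be tabulated against the position of the "TATAAA" signal. The sketch below does this with the numbers given in the text; the function names are illustrative, and the spacer is defined here as the nucleotides lying strictly between the last base of the signal and the first base of the poly(A) sequence, which is an assumption about how such a distance would be counted.

```python
# Tally of poly(A) start sites for the 0.7 kb OTK27 transcript clones.
from collections import Counter

POLYA_SIGNAL_END = 588  # last nucleotide of "TATAAA" (nucleotides 583-588)

# poly(A) start position -> number of clones, as reported in the text
POLYA_STARTS = Counter({598: 2, 606: 3, 613: 1})


def spacer_lengths(starts, signal_end):
    """Nucleotides strictly between the signal and each poly(A) start."""
    return {pos: pos - signal_end - 1 for pos in sorted(starts)}


def mean_start(starts) -> float:
    """Clone-weighted mean poly(A) start position."""
    total = sum(starts.values())
    return sum(pos * n for pos, n in starts.items()) / total


if __name__ == "__main__":
    print("clones:", sum(POLYA_STARTS.values()))
    print("spacers:", spacer_lengths(POLYA_STARTS, POLYA_SIGNAL_END))
    print("mean start:", mean_start(POLYA_STARTS))
```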

Results of zoo blot analysis indicated that 
this gene is well conserved also in other vertebrates. 
Since this gene is expressed ubiquitously in normal adult 
tissues and conserved among a wide range of species, the 
gene product is likely to play an important physiological 
role. The evidence that yeasts lacking in NHP2 are 
nonviable suggests that the human homolog may also be 
essential to cell viability. 

(3) Chromosomal localization of OTK27 by direct R- 
banding FISH 

One cosmid clone corresponding to the cDNA
OTK27 was isolated from a total human genomic cosmid
library (5-genome equivalent) using the OTK27 cDNA insert
as a probe and subjected to FISH in the same manner as in
Example 1 (3) for chromosomal localization of OTK27.

As a result, two distinct spots were observed
on the chromosome band 12q24.3.

The OTK27 gene of the present invention can be
used in causing expression thereof and detecting the
OTK27 protein, a human nucleoprotein, and thus can be
utilized in the diagnosis and pathologic studies of
various diseases in which said protein is involved and,
because of its involvement in cell proliferation and
differentiation, in screening out and evaluating
therapeutic and preventive drugs for cancer.

Example 4 

OTK18 gene 

(1) OTK18 gene cloning and DNA sequencing 

Zinc finger proteins constitute
a large family of transcription-regulating proteins in
eukaryotes and carry evolutionally conserved structural
motifs [Kadonaga, J. T., et al., Cell, 51, 1079-1090
(1987); Klug, A. and Rhodes, D., Trends Biochem. Sci., 12,
464-469 (1987); Evans, R. M. and Hollenberg, S. M., Cell,
52, 1-3 (1988)].

The zinc finger, a loop-like motif formed by
the interaction between the zinc ion and two residues,
cysteine and histidine residues, is involved in the
sequence-specific binding of a protein to RNA or DNA.
The zinc finger motif was first identified within the
amino acid sequence of the Xenopus transcription factor
IIIA [Miller, J., et al., EMBO J., 4, 1609-1614 (1986)].

The C2H2 finger motif is in general tandemly
repeated and contains an evolutionally conserved
intervening sequence of 7 or 8 amino acids. This intervening
stretch was first identified in the Kruppel segmentation
gene of Drosophila [Rosenberg, U. B., et al., Nature,
319, 336-339 (1986)]. Since then, hundreds of zinc
finger protein-encoding genes have been found in
vertebrate genomes.

As a result of sequence analysis of cDNA clones
arbitrarily selected from a human fetal brain cDNA
library in the same manner as in Example 1 (1) and
database searching, several zinc finger
structure-containing clones were identified and, further, a clone
having a zinc finger structure of the Kruppel type was
found.

Since this clone lacked the 5' portion of the
transcript, plaque hybridization was performed with a
fetal brain cDNA library using, as a probe, an
approximately 1.8 kb insert in the cDNA clone, whereby three
clones were isolated. The nucleotide sequences of these
were determined in the same manner as in Example 1 (1).

Among the three clones, the one having the
largest insert spans 3,754 nucleotides including an open
reading frame of 2,133 nucleotides coding for 711 amino
acids. It was found that said clone contains a novel
human gene coding for a peptide highly homologous in the
zinc finger domain to those encoded by human ZNF41 and
the Drosophila Kruppel gene. This gene was named OTK18
gene (derived from the clone GEN-076C09).

The nucleotide sequence of the cDNA clone of
the OTK18 gene is shown under SEQ ID NO: 12, the coding
region-containing nucleotide sequence under SEQ ID NO: 11,
and the predicted amino acid sequence encoded by said
OTK18 gene under SEQ ID NO: 10.

It was found that the amino acid sequence of 
OTK18 as deduced from SEQ ID NO: 12 contains 13 finger 
motifs on its carboxy side. 

(2) Comparison with other zinc finger motif-containing
genes

Comparison among OTK18, human ZNF41 and the
Drosophila Kruppel gene revealed that each finger motif
is for the most part conserved in the consensus sequence
CXECGKAFXQKSXLX2HQRXH.
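
A consensus written in this notation can be turned into a regular expression and used to scan a protein sequence for finger motifs. The sketch below is illustrative only: it assumes that "X" stands for any one residue and "X2" for any two residues, and it demands an exact match at every fixed position, which is stricter than the "for the most part conserved" relationship described in the text. The test sequence is a hypothetical construct, not part of OTK18.

```python
# Scan a protein sequence for the Kruppel-type consensus given in the text.
# Assumptions: "X" = any single residue, "Xn" = any n residues; fixed
# positions must match exactly (real fingers tolerate substitutions).
import re

CONSENSUS = "CXECGKAFXQKSXLX2HQRXH"


def consensus_to_regex(consensus: str) -> "re.Pattern[str]":
    """Translate X / Xn wildcards into a compiled regular expression."""
    parts = []
    i = 0
    while i < len(consensus):
        ch = consensus[i]
        i += 1
        if ch == "X":
            digits = ""
            while i < len(consensus) and consensus[i].isdigit():
                digits += consensus[i]
                i += 1
            count = int(digits) if digits else 1
            parts.append("[A-Z]{%d}" % count)
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def find_fingers(protein: str):
    """Return (start index, matched text) for each non-overlapping match."""
    pattern = consensus_to_regex(CONSENSUS)
    return [(m.start(), m.group()) for m in pattern.finditer(protein)]


if __name__ == "__main__":
    # Hypothetical sequence with two tandem fingers, for illustration only.
    finger = "CEECGKAFSQKSNLTTHQRIH"
    toy_protein = "MTGEKPYK" + finger + "TGEKPYK" + finger
    for start, text in find_fingers(toy_protein):
        print(start, text)
```

A tolerant search (for example, allowing one or two mismatches at fixed positions) would be closer to how degenerate finger repeats are recognized in practice, but it is omitted here to keep the sketch short.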

Comparison of the consensus sequence of the
zinc finger motifs of OTK18 with those of human ZNF41 and
the Drosophila Kruppel gene revealed that the Kruppel
type motif is well conserved in the OTK18-encoded
protein. However, the sequence similarities were limited
to zinc finger domains and no significant homologies were
found with regard to other regions.

The zinc finger domain interacts specifically
with the target DNA, recognizing an about 5 bp sequence
to thereby bind to the DNA helix [Rhodes, D. and Klug,
A., Cell, 46, 123-132 (1986)].

Based on the idea that, in view of the above,
the multiple module (tandem repetitions of zinc finger)
can interact with long stretches of DNA, it is presumable
that the target DNA of this gene product containing 13
repeated zinc finger units would be a DNA fragment with a
length of approximately 65 bp.

(3) Northern blot analysis 

Northern blot analysis was performed as
described in Example 1 (2) for checking normal human
tissues for expression of the human OTK18 mRNA therein by
amplifying the insert of the OTK18 cDNA by PCR, purifying
the PCR product, labeling the same with [32P]-dCTP
(random-primed DNA labeling kit, Boehringer Mannheim) and
using an MTN blot with the labeled product as a probe.

The results of Northern blot analysis revealed
that the transcript of OTK18 is approximately 4.3 kb long
and is expressed ubiquitously in various normal adult 
tissues. However, the expression level in the liver and 
in peripheral blood lymphocytes seemed to be lower than 
in other organs tested. 

(4) Cosmid cloning and chromosomal localization by
direct R-banding FISH

Chromosomal localization of OTK18 was carried
out as described in Example 1 (3).

As a result, complete twin spots were
identified with 8 samples while 23 samples showed an
incomplete signal or twin spots on either or both
homologs. All signals appeared at the q13.4 band of
chromosome 19. No twin spots were observed on any other
chromosomes.

The results of FISH thus revealed that this
gene is localized on chromosomal band 19q13.4. This
region is known to contain many DNA segments that
hybridize with oligonucleotides corresponding to zinc
finger domains [Hoovers, J. N., et al., Genomics, 12,
254-263 (1992)]. In addition, at least one other gene
coding for a zinc finger domain has been identified in
this region [Marine, J.-C., et al., Genomics, 21, 285-286
(1994)].

Hence, the chromosome 19q13 is presumably a
site of grouping of multiple genes coding for
transcription-regulating proteins.

When the novel human OTK18 gene provided by
this example is used, it becomes possible to detect
expression of said gene in various tissues and produce
the human OTK18 protein in the manner of genetic
engineering. Through these, it is possible to analyze
the functions of the human transcription regulating
protein gene system or human transcription regulating
proteins, which are deeply involved in diverse activities
fundamental to cells, as mentioned above, to diagnose
various diseases with which said gene is associated, for
example malformation or cancer resulting from a
developmental or differentiation anomaly, and mental or
nervous disorder resulting from a developmental anomaly
in the nervous system, and further to screen out and
evaluate therapeutic or prophylactic drugs for these
diseases.

Example 5 

Genes encoding human 26S proteasome constituent P42 
protein and P27 protein 

(1) Cloning and DNA sequencing of genes respectively 
encoding human 26S proteasome constituent P42 
protein and P27 protein 

Proteasome, which is a multifunctional
protease, is an enzyme occurring widely in eukaryotes
from yeasts to humans and decomposing ubiquitin-binding
proteins in cells in an energy-dependent manner.
Structurally, said proteasome is constituted of 20S
proteasome composed of various constituents with a
molecular weight of 21 to 31 kilodaltons and a group of
PA700 regulatory proteins composed of various
constituents with a molecular weight of 30 to 112
kilodaltons and showing a sedimentation coefficient of
22S and, as a whole, occurs as a macromolecule with a
molecular weight of about 2 million daltons and a
sedimentation coefficient of 26S [Rechsteiner, M., et
al., J. Biol. Chem., 268, 6065-6068 (1993); Yoshimura,
T., et al., J. Struct. Biol., 111, 200-211 (1993);
Tanaka, K., et al., New Biologist, 4, 173-187 (1992)].

Despite structural and mechanical analyses 
thereof, the whole picture of proteasome is not yet fully 
clear. However, according to studies using yeasts and 
mice in the main, it reportedly has the functions 
mentioned below and its functions are becoming more and 
more elucidated. 

The mechanism of energy-dependent proteolysis
in cells starts with selection of proteins by ubiquitin
binding. It is not 20S proteasome but 26S proteasome
that has ubiquitin-conjugated protein decomposing
activity which is ATP-dependent [Chu-Ping et al., J.
Biol. Chem., 269, 3539-3547 (1994)]. Hence, human 26S
proteasome is considered to be useful in elucidating the
mechanism of energy-dependent proteolysis.

Factors involved in the cell cycle regulation
are generally short in half-life and in many cases they
are subject to strict quantitative control. In fact, it
has been made clear that the oncogene products Mos, Myc,
Fos and so forth can be decomposed by 26S proteasome in
an energy- and ubiquitin-dependent manner [Ishida, N., et
al., FEBS Lett., 324, 345-348 (1993); Hershko, A. and
Ciechanover, A., Annu. Rev. Biochem., 61, 761-807
(1992)] and the importance of proteasome in cell cycle
control is being recognized.

Its importance in the immune system has also
been pointed out. It is suggested that proteasome is
positively involved in class I major histocompatibility
complex antigen presentation [Michalek, M. T., et al.,
Nature, 363, 552-554 (1993)] and it is further suggested
that proteasome may be involved in Alzheimer disease,
since abnormal accumulation of
ubiquitin-conjugated proteins is observed in the brain of
patients with Alzheimer disease [Kitaguchi, N., et al., Nature,
361, 530-532 (1988)]. Because of its diverse functions
such as those mentioned above, proteasome attracts
attention from the viewpoint of its utility in the
diagnosis and treatment of various diseases.

A main function of 26S proteasome is
ubiquitin-conjugated protein decomposing activity. In particular,
it is known that cell cycle-related gene products such as
oncogene products and cyclins, typically c-Myc, are
degraded via ubiquitin-dependent pathways. It has also
been observed that the proteasome gene is expressed
abnormally in liver cancer cells, renal cancer cells,
leukemia cells and the like as compared with normal cells
[Kanayama, H., et al., Cancer Res., 51, 6677-6685 (1991)]
and that proteasome is abnormally accumulated in tumor
cell nuclei. Hence, constituents of proteasome are
expected to be useful in studying the mechanism of such
canceration and in the diagnosis or treatment of cancer.

Also, it is known that the expression of
proteasome is induced by interferon γ and so on and is
deeply involved in antigen presentation in cells [Aki,
M., et al., J. Biochem., 115, 257-269 (1994)]. Hence,
constituents of human proteasome are expected to be
useful in studying the mechanism of antigen presentation
in the immune system and in developing immunoregulating
drugs.

Furthermore, proteasome is considered to be
deeply associated with ubiquitin abnormally accumulated
in the brain of patients with Alzheimer disease. Hence,
it is suggested that constituents of human proteasome
should be useful in studying the cause of Alzheimer
disease and in the treatment of said disease.

In addition to the utilization of expectedly
multifunctional proteasome as such in the above manner,
it is probably possible to produce antibodies using
constituents of proteasome as antigens and use such
antibodies in diagnosing various diseases by immunoassay.
Its utility in this field of diagnosis is thus also a
focus of interest.

Meanwhile, a protein having the characteristics
of human 26S proteasome is disclosed, for example in
Japanese Unexamined Patent Publication No. 292964/1993
and rat proteasome constituents are disclosed in Japanese
Unexamined Patent Publication Nos. 268957/1993 and
317059/1993. However, no human 26S proteasome
constituents are known. Therefore, the present inventors
made a further search for human 26S proteasome
constituents and successfully obtained two novel human
26S proteasome constituents, namely human 26S proteasome
constituent P42 protein and human 26S proteasome
constituent P27 protein, and performed cloning and DNA
sequencing of the corresponding genes in the following
manner.

(1) Purification of human 26S proteasome constituents 
P42 protein and P27 protein 

Human proteasome was purified using about 100 g
of fresh human kidney and following the method of
purifying human proteasome as described in Japanese Unexamined
Patent Publication No. 292964/1993, namely by column
chromatography using BioGel A-1.5m (5 x 90 cm, Bio-Rad),
hydroxyapatite (1.5 x 15 cm, Bio-Rad) and Q-Sepharose
(1.5 x 15 cm, Pharmacia) and glycerol density gradient
centrifugation.

The thus-obtained human proteasome was
subjected to reversed phase high performance liquid
chromatography (HPLC) using a Hitachi model L6200 HPLC
system. A Shodex RS Pak D4-613 (0.6 x 15 cm, Showa
Denko) was used and gradient elution was performed with
the following two solutions:

First solution: 0.06% trifluoroacetic acid;
Second solution: 0.05% trifluoroacetic acid, 70%
acetonitrile.

An aliquot of each eluate fraction was
subjected to 8.5% SDS-polyacrylamide electrophoresis
under conditions of reduction with dithiothreitol. The
P42 protein and P27 protein thus detected were isolated
and purified.

The purified P42 and P27 proteins were each
digested with 1 µg of trypsin in 0.1 M Tris buffer
(pH 7.8) containing 2 M urea at 37°C for 8 hours. The
partial peptide fragments obtained were separated by
reversed phase HPLC and their sequences were determined
by Edman degradation. The results obtained are as shown
below in Table 2.

Table 2

Partial protein    Amino acid sequence

P42 (1)            VLNISLW
    (2)            TLMELLNQ^4DGFDTLHR
    (3)            AVSDFWSEYXMXA
    (4)            EVDPLVYNX
    (5)            HGEIDYEAIVK
    (6)            LSXGFNGADLRNVXTEAGMFAIXAD
    (7)            MIMATNRPDTLDPALIiRPGXL
    (8)            IHIDLPNEQARLDILK
    (9)            ATN6PRYVWG
    (10)           EIDGRLK
    (11)           ALQSVGQIVGEVLK
    (12)           ILAGPXTK
    (13)           XXVIELPLTNPELFQG
    (14)           WSSSLVDK
    (15)           ALQDYRK
    (16)           EHREQLK
    (17)           KLESKLDYKPVR

P27 (1)            LVPTR
    (2)            AKEEEIEAQIK
    (3)            ANYEVLESQK
    (4)            VEDALHQLHAR
    (5)            DVDLYQVR
    (6)            QSQGLSPAQAFAK
    (7)            AGSQSGGSPEASGVTVSDVQE
    (8)            GLLGXNIIPLQR

(2) cDNA library screening, clone isolation and cDNA
nucleotide sequence determination

As mentioned in Example 1 (1), the present
inventors have a database comprising about 30,000 cDNA
data as constructed based on large-scale DNA sequencing
using human fetal brain, arterial blood vessel and
placenta cDNA libraries.

Based on the amino acid sequences obtained as
mentioned above in (1), computer searching was performed
with the FASTA program (search for homology between said
amino acid sequences and the amino acid sequences
estimated from the database). As regards P42, a clone
(GEN-331G07) showing identity with regard to two amino
acid sequences [(2) and (7) shown in Table 2] was
screened out and, as regards P27, a clone (GEN-163D09)
showing identity with regard to two amino acid sequences
[(1) and (8) shown in Table 2] was found.
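
The identity check described above can be illustrated with a minimal Python sketch. It is not the FASTA program, which performs a heuristic similarity search; it only tests whether a peptide fragment occurs exactly in a deduced protein sequence, treating X (an unidentified residue) as a wildcard. The function names and the toy protein are hypothetical; only the two P27 fragments are taken from Table 2.

```python
import re


def peptide_regex(peptide: str) -> str:
    """Build a regex in which 'X' (unidentified residue) matches any residue."""
    return "".join("." if aa == "X" else re.escape(aa) for aa in peptide)


def find_peptide(peptide: str, protein: str) -> list[int]:
    """Return 1-based start positions of all (possibly overlapping) matches."""
    pattern = re.compile("(?=" + peptide_regex(peptide) + ")")
    return [m.start() + 1 for m in pattern.finditer(protein)]


# Hypothetical deduced sequence containing P27 fragments (1) and (8).
toy_protein = "MK" + "LVPTR" + "AAA" + "GLLGSNIIPLQR" + "W"

print(find_peptide("LVPTR", toy_protein))
print(find_peptide("GLLGXNIIPLQR", toy_protein))
```

A clone would be a candidate when two or more independent fragments of the same protein are found in its deduced sequence.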

For each of these clones, the 5' side sequence
was determined by 5' RACE and the whole sequence was 
determined, in the same manner as in Example 2 (2). 

As a result, it was revealed that the above-
mentioned P42 clone GEN-331G07 comprises a 1,566-
nucleotide sequence as shown under SEQ ID NO: 15,
inclusive of a 1,167-nucleotide open reading frame as
shown under SEQ ID NO: 14, and that the amino acid
sequence encoded thereby is the one shown under SEQ ID
NO: 13 and comprises 389 amino acid residues.

The results of computer homology search
revealed that the P42 protein is significantly homologous
to the AAA (ATPase associated with a variety of cellular
activities) protein family (e.g. P45, TBP1, TBP7, S4,
MSS1, etc.). It was thus suggested that it is a new
member of the AAA protein family.

As for the P27 clone GEN-163D09, it was
revealed that it comprises a 1,128-nucleotide sequence as
shown under SEQ ID NO: 18, including a 669-nucleotide open
reading frame as shown under SEQ ID NO: 17 and that the
amino acid sequence encoded thereby is the one shown
under SEQ ID NO: 16 and comprises 223 amino acid residues.
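
The open reading frame lengths and residue counts stated above can be cross-checked with a short Python sketch. It assumes that the stated open reading frame lengths exclude the termination codon, which is consistent with the figures given (1,167 = 3 x 389 and 669 = 3 x 223). The function name is hypothetical.

```python
def residues_from_orf(orf_nucleotides: int, includes_stop: bool = False) -> int:
    """Number of amino acid residues encoded by an open reading frame."""
    if orf_nucleotides % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_nucleotides // 3
    # A termination codon, if counted in the length, encodes no residue.
    return codons - 1 if includes_stop else codons


print(residues_from_orf(1167))  # P42
print(residues_from_orf(669))   # P27
```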

As regards the P27 protein, homology search 
using a computer failed to reveal any homologous gene 
among public databases. Thus, the gene in question is 
presumably a novel gene having an unknown function. 

Originally, the above-mentioned P42 and P27
gene products were both purified as regulatory subunit
components of proteasome complex. Therefore, these are
expected to play an important role in various biological
functions through proteolysis, for example a role in
energy supply through decomposition of ATP and, hence,
they are presumably useful not only in studying the
function of human 26S proteasome but also in the
diagnosis and treatment of various diseases caused by
lowering of said biological functions, among others.

Example 6 

BNAP gene 

(1) BNAP gene cloning and DNA sequencing 

The nucleosome composed of DNA and histone is a
fundamental structure constituting chromosomes in
eukaryotic cells and is well conserved across
species. This structure is closely associated with the
processes of replication and transcription of DNA.
However, the nucleosome formation is not fully understood
as yet. Only certain specific factors involved in
nucleosome assembly (NAPs) have been identified. Thus,
two acidic proteins, nucleoplasmin and N1, are already
known to facilitate nucleosome construction
[Kleinschmidt, J. A., et al., J. Biol. Chem., 260, 1166-
1176 (1985); Dilworth, S. M., et al., Cell, 51, 1009-1018
(1987)].

A yeast gene, NAP-I, was isolated using a mono-
clonal antibody and recombinant proteins derived
therefrom were tested as to whether they have nucleosome
assembling activity in vivo.

More recently, a mouse NAP-I gene, which is a
mammalian homolog of the yeast NAP-I gene, was cloned
(Okuda, A.; registered in database under the accession
number D12618). Also cloned were a mouse gene, DN38
[Kato, K., Eur. J. Neurosci., 2, 704-711 (1990)] and a
human nucleosome assembly protein (hNRP) [Simon, H. U.,
et al., Biochem. J., 297, 389-397 (1994)]. It was shown
that the hNRP gene is expressed in many tissues and is
associated with T lymphocyte proliferation.

The present inventors performed sequence
analysis of cDNA clones arbitrarily chosen from a human
fetal brain cDNA library in the same manner as in Example
1 (1), followed by searches among databases and, as a
result, made it clear that a 1,125-nucleotide cDNA clone
(free of poly(A)), GEN-078D05, is significantly
homologous to the mouse NAP-I gene, which is a gene for a
nucleosome assembly protein (NAP) involved in nucleosome
construction, a mouse partial cDNA clone, DN38, and hNRP.

Since said clone GEN-078D05 was lacking in the
5' region, 5' RACE was performed in the same manner as in
Example 2 (2) to obtain the whole coding region. For
this 5' RACE, primers P1 and P2, respectively having the
nucleotide sequences shown below in Table 3, were used.



Table 3

Primer       Nucleotide sequence

Primer P1    5'-TTGAAGAATGATGCATTAGGAACCAC-3'
Primer P2    5'-CACTCGAGTGGCTGGATTTCAATTTCTCCAGTAG-3'

After the first 5' RACE, a single band
corresponding to a sequence length of 1,300 nucleotides
was obtained. This product was inserted into pT7Blue(R)
T-Vector and several clones appropriate in insert size
were selected.

Ten 5' RACE clones obtained from two
independent PCR reactions were sequenced and the longest
clone GEN-078D05TA13 (about 1,300 nucleotides long) was
further analyzed.

Both strands of the two overlapping cDNA clones
GEN-078D05 and GEN-078D05TA13 were sequenced, whereby it
was confirmed that the two clones did not yet cover the
whole coding region. Therefore, a second 5' RACE
was carried out. For the second 5' RACE, two primers, P3
and P4, respectively having the sequences shown below in
Table 4 were used.



Table 4

Primer       Nucleotide sequence

Primer P3    5'-GTCGAGCTAGCCATCTCCTCTTCG-3'
Primer P4    5'-CATGGGCGACAGGTTCCGAGACC-3'



A clone, GEN-078D0508, obtained by the second
5' RACE was 300 nucleotides long. This clone contained
an estimable initiation codon and three preceding in-
frame termination codons. From these three overlapping
clones, it became clear that the whole coding region
comprises 2,636 nucleotides. This gene was named brain-
specific nucleosome assembly protein (BNAP) gene.

The BNAP gene contains a 1,518-nucleotide open
reading frame shown under SEQ ID NO: 20. The amino acid
sequence encoded thereby comprises 506 amino acid
residues, as shown under SEQ ID NO: 19, and the nucleotide
sequence of the whole cDNA clone of BNAP is as shown under
SEQ ID NO: 21.

As shown under SEQ ID NO: 21, the 5' noncoding
region of said gene was found to be generally rich in GC.
Candidate initiation codon sequences were found at
nucleotides Nos. 266-268, 287-289 and 329-331. These
three sequences all had well conserved sequences in the
vicinity of the initiation codons [Kozak, M., J. Biol.
Chem., 266, 19867-19870 (1991)].

According to the scanning model, the first ATG 
(nucleotides Nos. 266-268) of the cDNA clone may be the 
initiation codon. The termination codon was located at
nucleotides Nos. 1784-1786. 
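
The coordinates given above are mutually consistent, as the following Python sketch illustrates. It assumes 1-based nucleotide numbering and an open reading frame length that excludes the termination codon; the function name is hypothetical.

```python
def orf_length(start_codon_first: int, stop_codon_first: int) -> int:
    """ORF length in nucleotides, from the first base of the initiation
    codon up to (not including) the first base of the termination codon."""
    length = stop_codon_first - start_codon_first
    if length <= 0 or length % 3 != 0:
        raise ValueError("codons are not in the same reading frame")
    return length


STOP = 1784  # first base of the termination codon (Nos. 1784-1786)

# The three candidate ATGs are all in frame with the termination codon.
for atg in (266, 287, 329):
    nt = orf_length(atg, STOP)
    print(atg, nt, nt // 3)
```

With the first ATG (Nos. 266-268) the sketch gives 1,518 nucleotides and 506 codons, matching the open reading frame and residue count stated for BNAP.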

The 3' noncoding region was generally rich in
AT and two polyadenylation signals (AATAAA) were located
at nucleotides Nos. 2606-2611 and 2610-2615,
respectively.

The longest open reading frame comprised 1,518
nucleotides coding for 506 amino acid residues and the
calculated molecular weight of the BNAP gene product was
57,600 daltons.

Hydrophilic plots indicated that BNAP is very 
hydrophilic, like other NAPs.

For recombinant BNAP expression and
purification and for eliminating the possibility that the
BNAP gene sequence might give three chimera clones in the
step of 5' RACE, RT-PCR was performed using a sequence
comprising nucleotides Nos. 326-356 as a sense primer and
a sequence comprising nucleotides Nos. 1758-1786 as an
antisense primer.

As a result, a single product of about 1,500 bp 
was obtained and it was thus confirmed that said sequence 
is not a chimera but a single transcript. 
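
The expected size of the RT-PCR product follows directly from the primer coordinates. The Python sketch below assumes 1-based inclusive numbering on the cDNA sequence; the function name is hypothetical.

```python
def amplicon_length(sense_first: int, antisense_last: int) -> int:
    """Length of a PCR product spanning from the first base of the sense
    primer to the last base covered by the antisense primer (inclusive)."""
    if antisense_last < sense_first:
        raise ValueError("antisense primer must lie downstream of sense primer")
    return antisense_last - sense_first + 1


# Sense primer: Nos. 326-356; antisense primer: Nos. 1758-1786.
print(amplicon_length(326, 1786))
```

The result, 1,461 bp, agrees with the single product of about 1,500 bp that was observed.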
(2) Comparison between BNAP and NAPs 

The amino acid sequence deduced from BNAP 
showed 46% identity and 65% similarity to hNRP. 

The deduced BNAP gene product had motifs 
characteristic of the NAPs already reported and of BNAP. 
In general, half of the C terminus was well conserved in 
humans and yeasts. 

The first motif (domain I) is KGIPDYWLI
(corresponding to amino acid residues Nos. 309-317). This
was observed also in hNRP (KGIPSFWLT) and in yeast NAP-I
(KGIPEFWLT).

The second motif (domain II) is ASFFNFFSPP
(corresponding to amino acid residues Nos. 437-446) and
this was expressed as DSFFNFFAPP in hNRP and as ESFFNFFSP
in yeast NAP-I.
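
The degree of conservation of these motifs can be illustrated with a small Python sketch that counts identical positions between two ungapped motifs of equal length. This is a simplification, not the alignment method used for the 46% identity figure above; the function name is hypothetical. The yeast domain II motif as given is one residue shorter and is therefore not compared here.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of identical positions between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Domain I: BNAP versus hNRP and yeast NAP-I.
print(round(percent_identity("KGIPDYWLI", "KGIPSFWLT"), 1))
print(round(percent_identity("KGIPDYWLI", "KGIPEFWLT"), 1))
# Domain II: BNAP versus hNRP.
print(round(percent_identity("ASFFNFFSPP", "DSFFNFFAPP"), 1))
```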

These two motifs were also conserved in the 
deduced mouse NAP-I and DN38 peptides. Both conserved 
motifs were each a hydrophilic cluster, and the Cys in
position 402 was also found conserved. 

Half of the N terminus had no motifs strictly 
conserved from yeasts to mammalian species, while motifs 
conserved among mammalian species were found. 

For instance, HDLERKYA (corresponding to amino 
acid residues Nos. 130 to 137) and IINAEYEPTEEECEW 
(corresponding to amino acid residues Nos. 150-164), 
which may be associated with mammal-specific functions, 
were found strictly conserved. 

NAPs had acidic stretches, which are believed 
to be readily capable of binding to histone or other 
basic proteins. All NAPs had three acidic stretches but 
the locations thereof were not conserved. 

BNAP has no such three acidic stretches but,
instead, three repeated sequences (corresponding to amino
acid residues Nos. 194-207, 208-221 and 222-235) with a
long acidic cluster, inclusive of 41 amino acid residues
out of 98 amino acid residues, the consensus sequence
being ExxKExPEVKxEEK (each x being a nonconserved, mostly
hydrophobic, residue).
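
A consensus of this kind can be searched for with a regular expression, as in the following Python sketch. Each x is treated as any single residue. The toy sequence is hypothetical and is not the BNAP sequence; it only contains three tandem 14-residue repeats that fit the consensus, mirroring the spacing of residues Nos. 194-207, 208-221 and 222-235.

```python
import re

# Consensus ExxKExPEVKxEEK, with x matching any single residue.
REPEAT = re.compile(r"E..KE.PEVK.EEK")


def find_repeats(protein: str) -> list[tuple[int, int]]:
    """Return 1-based inclusive (start, end) positions of consensus repeats."""
    return [(m.start() + 1, m.end()) for m in REPEAT.finditer(protein)]


# Hypothetical sequence with three tandem repeats.
toy = "MS" + "EAVKELPEVKAEEK" + "EGIKEMPEVKVEEK" + "ELAKEVPEVKLEEK" + "GG"
print(find_repeats(toy))
```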

Furthermore, it was revealed that the BNAP
sequence had several BNAP-specific motifs. Thus, an
extremely serine-rich domain (corresponding to amino acid
residues Nos. 24-72) with 33 (67%) of 49 amino acid
residues being serine residues was found in the N-
terminus portion. On the nucleic acid level, they were
reflected as incomplete repetitions of AGC.

Following this serine-rich region, there 
appeared a basic domain (corresponding to amino acid 
residues Nos. 71-89) comprising 10 basic amino acid 
residues among 19 residues. 
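
The composition figures quoted for the serine-rich and basic domains are simple residue fractions over a numbered region. The Python sketch below shows the calculation on a hypothetical toy sequence (not the BNAP sequence), using 1-based inclusive residue numbers; the function names are hypothetical.

```python
def region_length(first: int, last: int) -> int:
    """Number of residues in a 1-based inclusive region."""
    return last - first + 1


def residue_fraction(protein: str, first: int, last: int, residues: str) -> float:
    """Fraction of positions first..last occupied by any of the given residues."""
    region = protein[first - 1:last]
    return sum(aa in residues for aa in region) / len(region)


print(region_length(24, 72))   # serine-rich domain
print(region_length(71, 89))   # basic domain
print(round(100 * 33 / 49))    # 33 serines out of 49 residues, in percent

toy = "MA" + "SSG" * 4         # hypothetical sequence
print(residue_fraction(toy, 3, 14, "S"))
```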

BNAP is supposed to be localized in the
nucleus. Two possible nuclear localization signals
(NLSs) were observed. The first signal was found in the
basic domain of BNAP and its sequence YRKKR
(corresponding to amino acid residues Nos. 75-79) was
similar to the NLS (GRKKR) of Tat of HIV-1. The second
signal was located in the C terminus and its sequence KKY^K
(corresponding to amino acid residues Nos. 502-506) was
similar to the NLS (KKKRK) of the large T antigen of SV40.
The presence of these two presumable NLSs suggested the
localization of BNAP in the nucleus. However, the
possibility that other basic clusters might act as NLSs
could not be excluded.

BNAP has several phosphorylation sites and the 
activity of BNAP may be controlled through phosphoryla- 
tion thereof. 

(3) Northern blot analysis 

Northern blot analysis was performed as
described in Example 1 (2). Thus, the clone GEN-
078D05TA13 (corresponding to nucleotides Nos. 323 to 1558
in the BNAP gene sequence) was amplified by PCR, the PCR
product was purified and labeled with [32P]-dCTP (random-
primed DNA labeling kit, Boehringer Mannheim), and the
expression of BNAP mRNA in normal human tissues was
examined using an MTN blot with the labeled product as a
probe.

As a result of Northern blot analysis, a 3.0 kb
transcript of BNAP was detected (8-hour exposure) in the
brain among eight human adult tissues tested, namely
heart, brain, placenta, lung, liver, skeletal muscle,
kidney and pancreas and, after longer exposure (24
hours), a dim band of the same size was detected in the
heart.

BNAP was found equally expressed in several
sites of brain tested whereas, in other tissues, no
signal was detected at all even after 72 hours of
exposure. hNRP mRNA was found expressed everywhere in
the human tissues tested whereas the expression of BNAP
mRNA was tissue-specific.

(4) Radiation hybrid mapping

Chromosomal mapping of the BNAP clone was
performed by means of radiation hybrid mapping [Cox, D.
R., et al., Science, 250, 245-250 (1990)].

Thus, a total human genome radiation hybrid
clone (G3RH) panel was purchased from Research Genetics,
Inc., AL, USA and PCR was carried out for chromosomal
mapping analysis according to the product manual using
two primers, A1 and A2, respectively having the
nucleotide sequences shown in Table 5.

Table 5

Primer       Nucleotide sequence

A1 primer    5'-CCTAAAAAGTGTCTAAGTGCCAGTT-3'
A2 primer    5'-TCAGTGAAAGGGAAGGTAGAACAC-3'


The results obtained were analyzed utilizing
software available on the Internet [Boehnke, M., et al.,
Am. J. Hum. Genet., 46, 581-586 (1991)].

As a result, the BNAP gene was found strongly
linked to the marker DXS990 (LOD = 1000, cR8000 = -0.00).
Since DXS990 is a marker localized on the chromosome
Xq21.3-q22, it was established that BNAP is localized to
the chromosomal locus Xq21.3-q22 where genes involved in
several signs or symptoms of X-chromosome-associated
mental retardation are localized.

The nucleosome is not only a fundamental 
chromosomal structural unit characteristic of eukaryotes 
but also a gene expression regulating unit. Several 
results indicate that genes with high transcription 
activity are sensitive to nuclease treatment, suggesting 
that the chromosome structure changes with the 
transcription activity [Elgin, S. R., J. Biol. Chem.,
263, 19259-19262 (1988)].

NAP-1 has been cloned in yeast, mouse and human 
and is one of the factors capable of promoting nucleosome 
construction in vivo. In a study performed on their 
sequences, NAPs containing the epitope of the specific 
antibody 4A8 were detected in human, mouse, frog, 
Drosophila and yeast (Saccharomyces cerevisiae) [Ishimi,
Y., et al., Eur. J. Biochem., 162, 19-24 (1987)].

In these experiments, NAPs, upon SDS-PAGE 
analysis, electrophoretically migrated to positions 
corresponding to a molecular weight between 50 and 60 
kDa, whereas the recombinant BNAP slowly migrated to a 
position of about 80 kDa. The epitope of 4A8 was shown 
to be localized in the second, well-conserved, 
hydrophobic motif. And, it was simultaneously shown that 
the triplet FNF is important as a part of the epitope 
[Fujii-Nakata, T., et al., J. Biol. Chem., 267, 20980-
20986 (1992)].

BNAP also contained this consensus motif in
domain II. The fact that domain II is markedly
hydrophobic and the fact that domain II can be recognized
by the immune system suggest that it is probably
presented on the BNAP surface and is possibly involved in
protein-protein interactions.

Domain I, too, may be involved in protein-
protein interactions. Considering that these are
conserved generally among NAPs, though to a relatively
low extent, it is conceivable that they must be essential
for nucleosome construction, although the functional
meaning of the conserved domains is still unknown.

The hNRP gene is expressed in thyroid gland,
stomach, kidney, intestine, leukemia, lung cancer,
mammary cancer and so on [Simon, H. U., et al., Biochem.
J., 297, 389-397 (1994)]. Thus, NAPs are expressed
everywhere and are thought to be playing an important
role in fundamental nucleosome formation.

BNAP may be involved in brain-specific
nucleosome formation and an insufficiency thereof may
cause neurological diseases or mental retardation as a
result of deviated functions of neurons.

BNAP was found strongly linked to a marker on
the X-chromosome q21.3-q22 where sequences involved in
several symptoms of X-chromosome-associated mental
retardation are localized. This center-surrounding
region of the X-chromosome is rich in genes responsible
for α-thalassemia, mental retardation (ATR-X) or some
other forms of mental retardation [Gibbons, R. J., et al.,
Cell, 80, 837-845 (1995)]. By analogy with the ATR-X
gene, which seems to regulate the nucleosome structure,
the present inventors suppose that BNAP may be involved
in a certain type of X-chromosome-linked mental
retardation.

According to this example, the novel BNAP gene
is provided and, when said gene is used, it is possible
to detect the expression of said gene in various tissues
and to produce the BNAP protein by the technology of
genetic engineering. Through these, it is possible to
study the brain nucleosome formation deeply involved, as
mentioned above, in variegated activities essential to
cells as well as the functions of cranial nerve cells and
to diagnose various neurological diseases or mental
retardation in which these are involved and screen out
and evaluate drugs for the treatment or prevention of
such diseases.

Example 7 

Human skeletal muscle-specific ubiquitin-conjugating
enzyme gene (UBE2G gene)

The ubiquitin system is a group of enzymes
essential for cellular processes and is conserved from
yeast to human. Said system is composed of ubiquitin-
activating enzymes (UBAs), ubiquitin-conjugating enzymes
(UBCs), ubiquitin protein ligases (UBRs) and 26S
proteasome particles.

Ubiquitin is transferred from the above-
mentioned UBAs to several UBCs, whereby it is activated.
UBCs transfer ubiquitins to target proteins with or
without the participation of UBRs. These ubiquitin-
conjugated target proteins are said to induce a number of
cellular responses, such as protein degradation, protein
modification, protein translocation, DNA repair, cell
cycle control, transcription control, stress responses
and immunological responses [Jentsch, S., et al.,
Biochim. Biophys. Acta, 1089, 127-139 (1991); Hershko, A.
and Ciechanover, A., Annu. Rev. Biochem., 61, 761-807
(1992); Jentsch, S., Annu. Rev. Genet., 26, 179-207
(1992); Ciechanover, A., Cell, 79, 13-21 (1994)].

UBCs are key components of this system and seem
to have distinct substrate specificities and modulate
different functions. For example, Saccharomyces
cerevisiae UBC7 is induced by cadmium and involved in
resistance to cadmium poisoning [Jungmann, J., et al.,
Nature, 361, 369-371 (1993)]. Degradation of MAT-α2 is
also executed by UBC7 and UBC6 [Chen, P., et al., Cell,
74, 357-369 (1993)].

The novel gene obtained in this example is a
UBC7-like gene strongly expressed in human skeletal
muscle. In the following, cloning and DNA sequencing
thereof are described.

(1) Cloning and DNA sequencing of human skeletal muscle-
specific ubiquitin-conjugating enzyme gene (UBE2G
gene)

Following the same procedure as in Example 1
(1), cDNA clones were arbitrarily selected from a human
fetal brain cDNA library and subjected to sequence
analysis, and database searches were performed. As a
result, a cDNA clone, GEN-423A12, was found to have a
significantly high level of homology to the genes coding
for ubiquitin-conjugating enzymes (UBCs) in various
species.

Since said GEN-423A12 clone was lacking in the
5' side, 5' RACE was performed in the same manner as in
Example 2 (2) to obtain an entire coding region.

For said 5' RACE, two primers, P1 and P2,
respectively having the nucleotide sequences shown in
Table 6 were used.

Table 6

Primer       Nucleotide sequence

P1 primer    5'-TAATGAATTTCATTTTAGGAGGTCGG-3'
P2 primer    5'-ATCTTTTGGGAAAGTAAGATGAGCC-3'

The 5' RACE product was inserted into
pT7Blue(R) T-Vector and clones with an insert proper in
size were selected.

Four of the 5' RACE clones obtained from two
independent PCR reactions contained the same sequence but
were different in length.

By sequencing the above clones, the coding
sequence and adjacent 5'- and 3'-flanking sequences of
the novel gene were determined.

As a result, it was revealed that the novel
gene has a total length of 617 nucleotides. This gene
was named human skeletal muscle-specific ubiquitin-
conjugating enzyme gene (UBE2G gene).

To exclude the conceivable possibility that 
this sequence was a chimera clone, RT-PCR was performed 
in the same manner as in Example 6 (1) using the sense 
primer to amplify said sequence from the human fetal 
brain cDNA library. As a result, a single PCR product 
was obtained, whereby it was confirmed that said sequence 
is not a chimera one. 

The UBE2G gene contains an open reading frame
of 510 nucleotides, which is shown under SEQ ID NO: 23,
the amino acid sequence encoded thereby comprises 170
amino acid residues, as shown under SEQ ID NO: 22, and the
nucleotide sequence of the entire UBE2G cDNA is as shown
under SEQ ID NO: 24.

As shown under SEQ ID NO: 24, the estimable 
initiation codon was located at nucleotides Nos. 19-21, 
corresponding to the first ATG triplet of the cDNA clone. 
Since no preceding in-frame termination codon was found, 
it was deduced that this clone contains the entire open 
reading frame on the following grounds. 

Thus, (a) the amino acid sequence is highly
homologous to S. cerevisiae UBC7 and said initiation
codon agrees with that of yeast UBC7, supporting said ATG
as such. (b) The sequence AGGATGA is similar to the
consensus sequence (A/G)CCATGG around the initiation
codon [Kozak, M., J. Biol. Chem., 266, 19867-19870
(1991)].
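
The comparison in (b) can be made explicit with a short Python sketch that checks each position of the observed sequence against the consensus (A/G)CCATGG, written here with the IUPAC code R for A or G. This is only a position-by-position comparison, not a validated predictor of translation initiation; the function name is hypothetical.

```python
# IUPAC codes needed for the consensus RCCATGG (R = A or G).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG"}


def kozak_matches(site: str, consensus: str = "RCCATGG") -> list[bool]:
    """Position-by-position agreement of a 7-nt site (-3 to +4) with the consensus."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    return [base in IUPAC[code] for base, code in zip(site.upper(), consensus)]


matches = kozak_matches("AGGATGA")
print(matches)
print(sum(matches), "of", len(matches), "positions agree")
```

For AGGATGA, the purine at position -3 and the ATG itself agree with the consensus, while positions -2, -1 and +4 differ.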

(2) Comparison in amino acid sequence between UBE2G and 
UBCs 

Comparison in amino acid sequence between UBE2G
and UBCs suggested that the active site cysteine capable
of binding to ubiquitin should be the 90th residue
cysteine. The peptides encoded by these genes seem to
belong to the same family.

(3) Northern blot analysis

Northern blot analysis was carried out as
described in Example 1 (2). Thus, the entire sequence of
UBE2G was amplified by PCR, the PCR product was purified
and labeled with [32P]-dCTP (random-primed DNA labeling
kit, Boehringer Mannheim) and the expression of UBE2G
mRNA in normal human tissues was examined using the
labeled product as a probe. The membrane used was an MTN
blot.

As a result of the Northern blot analysis, 4.4
kb, 2.4 kb and 1.6 kb transcripts could be detected in
all 16 human adult tissues, namely heart, brain,
placenta, lung, liver, skeletal muscle, kidney, pancreas,
spleen, thyroid gland, urinary bladder, testis, ovary,
small intestine, large intestine and peripheral blood
leukocyte, after 18 hours of exposure. Strong expression
of these transcripts was observed in skeletal muscle.
(4) Radiation hybrid mapping 

Chromosomal mapping of the UBE2G clone was
performed by radiation hybrid mapping in the same manner
as in Example 6 (4).

The primers C1 and C4 used in PCR for
chromosomal mapping analysis respectively correspond to
nucleotides Nos. 415-435 and nucleotides Nos. 509-528 in
the sequence shown under SEQ ID NO: 24 and their
nucleotide sequences are as shown below in Table 7.

Table 7

Primer       Nucleotide sequence

C1 primer    5'-GGAGACTCACCTGCTAATGTT-3'
C4 primer    5'-CTCAAAAGCAGTCTCTTGGC-3'


As a result, the UBE2G gene was found linked to
the markers D1S446 (LOD = 12.52, cR8000 = 8.60) and
D1S235 (LOD = 9.14, cR8000 = 22.46). These markers are
localized to the chromosome bands 1q42.13-q42.3.

UBE2G was expressed strongly in skeletal muscle 
and very weakly in all other tissues examined. All other 
UBCs are involved in essential cellular functions, such 
as cell cycle control, and those UBCs are expressed 
ubiquitously. However, the expression pattern of UBE2G 
might suggest a muscle-specific role thereof. 

While the three transcripts differing in size 
were detected, attempts failed to identify which 
corresponds to the cDNA clone. The primary structure of 
the UBE2G product showed an extreme homology to yeast 
UBC7. On the other hand, nematode UBC7 showed strong
homology to yeast UBC7. Yeast UBC7 is involved in
degradation of the repressor and further confers
resistance to cadmium
in yeasts. The similarities among these proteins suggest 
that they belong to the same family. 

It is speculated that UBE2G is involved in
degradation of muscle-specific proteins and that a defect
in said gene could lead to such diseases as muscular
dystrophy. Recently, another proteolytic enzyme, calpain
3, was found to be responsible for limb-girdle muscular
dystrophy type 2A [Richard, I., et al., Cell, 81, 27-40
(1995)]. At present, the chromosomal location of
UBE2G suggests no significant relationship with any
hereditary muscular disease but it is likely that a
relation to the gene will be unearthed by linkage
analysis in future.

In accordance with this example, the novel 
UBE2G gene is provided and the use of said gene enables 
detection of its expression in various tissues and 
production of the UBE2G protein by the technology of 
genetic engineering. Through these, it becomes possible 
to study the degradation of muscle-specific proteins
deeply involved in the variegated basic activities
essential to cells, as mentioned above, and the functions
of skeletal muscle, to diagnose various muscular diseases 
in which these are involved and further to screen out and 
evaluate drugs for the treatment and prevention of such 
diseases. 

Example 8 

TMP-2 gene 

(1) TMP-2 gene cloning and DNA sequencing

Following the procedure of Example 1 (1), cDNA
clones were arbitrarily selected from a human fetal brain
cDNA library and subjected to sequence analysis, and
database searches were performed. As a result, a clone
(GEN-092E10) having a cDNA sequence highly homologous to
a transmembrane protein gene (accession No.: U19878) was
found.

Membrane protein genes have so far been cloned
in frog (Xenopus laevis) and human. These are considered
to be genes for a transmembrane type protein having a
follistatin module and an epidermal growth factor (EGF)
domain (accession No.: U19878).

The sequence information of the above protein
gene indicated that the GEN-092E10 clone was lacking in
the 5' region, so that the λgt10 cDNA library (human
fetal brain 5'-STRETCH PLUS cDNA; Clontech) was screened
using the GEN-092E10 clone as a probe, whereby a cDNA
clone containing a further 5' upstream region was
isolated.

Both strands of this cDNA clone were sequenced, 
whereby the sequence covering the entire coding region 
became clear. This gene was named TMP-2 gene. 

The TMP-2 gene was found to contain an open
reading frame of 1,122 nucleotides, as shown under SEQ ID
NO: 26, encoding an amino acid sequence of 374 residues,
as shown under SEQ ID NO: 25. The nucleotide sequence of
the entire TMP-2 cDNA clone comprises 1,721 nucleotides,
as shown under SEQ ID NO: 27.

As shown under SEQ ID NO: 27, the 5' noncoding
region was generally rich in GC. Several candidates for 
the initiation codon were found but, according to the 
scanning model, the 5th ATG of the cDNA clone (bases Nos. 
368-370) was estimated as the initiation codon. The 
termination codon was located at nucleotides Nos. 1490- 
1492. The polyadenylation signal (AATAAA) was located at 
nucleotides Nos. 1703-1708. The calculated molecular 
weight of the TMP-2 gene product was 41,400 daltons. 

As mentioned above, the transmembrane genes
have a follistatin module and an EGF domain. These
motifs were also found conserved in the novel human gene 
of the present invention. 

The TMP-2 gene of the present invention
presumably plays an important role in cell proliferation
or intercellular communication, since, on the amino acid
level, said gene shows homology, across the EGF domain,
to TGF-α [transforming growth factor-α; Derynck, R., et
al., Cell, 38, 287-297 (1984)], beta-cellulin [Igarashi,
K. and Folkman, J., Science, 259, 1604-1607 (1993)],
heparin-binding EGF-like growth factor [Higashiyama, S.,
et al., Science, 251, 936-939 (1991)] and schwannoma-
derived growth factor [Kimura, H., et al., Nature, 348,
257-260 (1990)].

(2) Northern blot analysis 

Northern blot analysis was carried out as
described in Example 1 (2). Thus, the clone GEN-092E10 was
amplified by PCR, the PCR product was purified and
labeled with [32P]-dCTP (random-primed DNA labeling kit,
Boehringer Mannheim), and the expression of TMP-2 mRNA in
normal human tissues was examined using an MTN blot with
the labeled product as a probe.

As a result, high levels of expression were 
detected in brain and prostate gland. Said TMP-2 gene 
mRNA was about 2 kb in size. 

According to the present invention, the novel 
human TMP-2 gene is provided and the use of said gene 
makes it possible to detect the expression of said gene 
in various tissues or produce the human TMP-2 protein by 
the technology of genetic engineering and, through these, 
it becomes possible to study brain tumor and prostatic 
cancer, which are closely associated with cell 
proliferation or intercellular communication, as 
mentioned above, to diagnose these diseases and to screen 
out and evaluate drugs for the treatment and prevention 
of such diseases. 

Example 9 

Human NPIK gene 

(1) Human NPIK gene cloning and DNA sequencing

Following the procedures of Example 1 and
Example 2, cDNA clones were arbitrarily selected from a
human fetal brain cDNA library and subjected to sequence
analysis, and database searches were performed. As a
result, two cDNA clones highly homologous to the gene
coding for an amino acid sequence conserved in
phosphatidylinositol 3 and 4 kinases [Kunz, J., et al.,
Cell, 73, 585-596 (1993)] were obtained. These were
named GEN-428B12c1 and GEN-428B12c2 and the entire
sequences of these were determined as in the foregoing
examples.

As a result, the GEN-428B12c1 cDNA clone and
the GEN-428B12c2 clone were found to have coding
sequences differing by 12 amino acid residues at the 5'
terminus, the GEN-428B12c1 cDNA clone being longer by 12
amino acid residues.

The GEN-428B12c1 cDNA sequence of the human 
NPIK gene contained an open reading frame of 2,487 
nucleotides, as shown under SEQ ID NO: 32, encoding an 
amino acid sequence comprising 829 amino acid residues, 
as shown under SEQ ID NO: 31. The nucleotide sequence of 
the full-length cDNA clone comprised 3,324 nucleotides as 
shown under SEQ ID NO: 33. 

The estimated initiation codon was located, as 
shown under SEQ ID NO: 33, at nucleotides Nos. 115-117 
corresponding to the second ATG triplet of the cDNA 
clone. The termination codon was located at nucleotides 
Nos. 2602-2604 and the polyadenylation signal (AATAAA) at 
Nos. 3305-3310. 

On the other hand, the GEN-428B12c2 cDNA 
sequence of the human NPIK gene contained an open reading 
frame of 2,451 nucleotides, as shown under SEQ ID NO: 29. 
The amino acid sequence encoded thereby comprised 817 
amino acid residues, as shown under SEQ ID NO: 28. The 
nucleotide sequence of the full-length cDNA clone 
comprised 3,602 nucleotides, as shown under SEQ ID NO: 30. 

The estimated initiation codon was located, as 
shown under SEQ ID NO: 30, at nucleotides Nos. 429-431 
corresponding to the 7th ATG triplet of the cDNA clone. 
The termination codon was located at nucleotides Nos. 
2880-2882 and the polyadenylation signal (AATAAA) at Nos. 
3583-3588. 
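
The coordinates reported above for the two clones can be cross-checked with simple arithmetic. The following is only an illustrative sketch (the dictionary layout and the function name are assumptions made for this illustration and are not part of the disclosure): an open reading frame of N nucleotides should encode N/3 residues, and the termination codon should begin immediately after the last coding nucleotide.

```python
# Illustrative consistency check of the ORF coordinates reported in the text.
# CLONES and check_orf are hypothetical helpers, not part of the original method.

CLONES = {
    "GEN-428B12c1": {"orf_nt": 2487, "residues": 829, "start": 115, "stop": 2602},
    "GEN-428B12c2": {"orf_nt": 2451, "residues": 817, "start": 429, "stop": 2880},
}


def check_orf(orf_nt, residues, start, stop):
    """Return True if ORF length, residue count and codon positions agree."""
    codes_expected_residues = orf_nt == 3 * residues
    # The ORF spans nucleotides start .. start + orf_nt - 1,
    # so the termination codon should begin at start + orf_nt.
    stop_follows_orf = stop == start + orf_nt
    return codes_expected_residues and stop_follows_orf


for name, clone in CLONES.items():
    print(name, check_orf(**clone))
```

Under these assumptions both clones pass the check, and the difference of 36 nucleotides between the two open reading frames corresponds to the 12 additional residues of GEN-428B12c1.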

(2) Northern blot analysis 

Northern blot analysis was carried out as described
in Example 1 (2). Thus, the entire sequence of
human NPIK was amplified by PCR, the PCR product was
purified and labeled with [32P]-dCTP (random-primed DNA
labeling kit, Boehringer Mannheim), and normal human
tissues were examined for expression of the human NPIK
mRNA using the MTN blot membrane with the labeled product
as a probe.

As a result, expression of the human NPIK
gene was observed in the 16 human adult tissues
examined, and two transcripts, of about 3.8 kb and
about 5 kb, were detected.

Using primer A, which has the nucleotide sequence
shown below in Table 8 and contains the initiation
codon of the GEN-428B12c2 cDNA, and primer B, which is
shown in Table 8 and contains the termination codon, PCR was
performed with Human Fetal Brain Marathon-Ready cDNA
(Clontech) as a template, and the nucleotide sequence of
the PCR product was determined.



Table 8

Primer      Nucleotide sequence
Primer A    5'-ATGGGAGATAGAGTAGTGGAGG-3'
Primer B    5'-TCACATGATGCCGTTGGTGAG-3'
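
Primer B is stated to contain the termination codon; since it is written as an antisense sequence, the codon only becomes visible on the sense strand. The sketch below is an illustration only (the helper function is an assumption, not part of the disclosed procedure). It reverse-complements primer B and shows that the sense strand ends in TGA, while primer A begins with ATG.

```python
# Illustrative check that primer A starts with an initiation codon and that
# the reverse complement of primer B ends with a termination codon.
# reverse_complement is a hypothetical helper written for this sketch.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


PRIMER_A = "ATGGGAGATAGAGTAGTGGAGG"  # sense primer, as read from Table 8
PRIMER_B = "TCACATGATGCCGTTGGTGAG"   # antisense primer, as read from Table 8

sense_end = reverse_complement(PRIMER_B)
print(PRIMER_A[:3])    # initiation codon of the sense primer
print(sense_end)       # sense-strand sequence covered by primer B
print(sense_end[-3:])  # last codon on the sense strand
```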



As a result, it was found that the human NPIK
mRNA expressed included one lacking nucleotides Nos.
1060-1104 of the GEN-428B12c1 cDNA sequence (SEQ ID
NO: 33) (amino acids Nos. 316-330 of the amino acid
sequence under SEQ ID NO: 31) and one lacking
nucleotides Nos. 1897-1911 of the GEN-428B12c1 cDNA
sequence (SEQ ID NO: 33) (amino acids Nos. 595-599 of the
amino acid sequence under SEQ ID NO: 31).

It was further revealed that polymorphism
existed in this gene (428B12c1.fasta), as shown below in
Table 9, in the region of bases Nos. 1941-1966 of the
GEN-428B12c1 cDNA sequence shown under SEQ ID NO: 33,
whereby a mutant protein was encoded which resulted from
the mutation of IQDSCEITT (amino acid residues Nos.
610-618 in the amino acid sequence (SEQ ID NO: 31) encoded by
GEN-428B12c1) into YKILVISA.

Table 9 

1930 1940 1950 1959 

TCCATCAAGCCAArACAACArTCTTGT.GAA 
I II II III! II Mil I nil II II IN 

TCCATTTCGCAACAGCAGCGAGTGCCCCTTTCCATCAAGCC-ATACAACATTCTT.CrC— 
1900 1910 1920 1930 1940 1950 

1960 1970 1980 

ATTACGACTGATAGTGGCATG 
III II IMIIIIIIIIIll 

ATTTCGGCTCATAGTGGCATGATTGAACCAGTGGTCAATGCTGTGTCCATCCATCAGGTG 
1960 1970 1980 1990 2000 2010 
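
The two deletions reported earlier in this subsection are given both as nucleotide ranges and as residue ranges. As a hedged illustration only (the function name and its default argument are assumptions introduced here), the correspondence can be verified by counting codons from the initiation codon of the GEN-428B12c1 cDNA at nucleotide 115:

```python
# Illustrative mapping from a codon-aligned nucleotide range of the cDNA to
# residue numbers. nt_range_to_residues is a hypothetical helper.

def nt_range_to_residues(first_nt, last_nt, orf_start=115):
    """Map an in-frame nucleotide range to 1-based residue numbers."""
    length = last_nt - first_nt + 1
    offset = first_nt - orf_start
    if offset % 3 or length % 3:
        raise ValueError("range is not codon-aligned")
    first_residue = offset // 3 + 1
    last_residue = first_residue + length // 3 - 1
    return first_residue, last_residue


print(nt_range_to_residues(1060, 1104))  # first deletion
print(nt_range_to_residues(1897, 1911))  # second deletion
```

Both ranges are whole multiples of three nucleotides that start on a codon boundary, so each deletion removes complete codons (15 and 5 residues, respectively) without shifting the reading frame.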

(3) Chromosomal mapping of human NPIK gene by FISH 

Chromosomal mapping of the human NPIK gene was 
carried out by FISH as described in Example 1 (3). 

As a result, it was found that the locus of the
human NPIK gene is in the chromosomal position
1q21.1-q21.3.

The human NPIK gene, a novel human gene, of the
present invention included two cDNAs differing in the 5'
region and capable of encoding 829 and 817 amino acid
residues, as mentioned above. In view of this and
further in view of the findings that the mRNA
corresponding to this gene includes two deletable sites and
that polymorphism occurs in a specific region
corresponding to amino acid residues Nos. 610-618 of the
GEN-428B12c1 amino acid sequence (SEQ ID NO: 31), whereby a
mutant protein is encoded, it is conceivable that human
NPIK includes species resulting from a certain number of
combinations, namely human NPIK, deletion-containing
human NPIK, human NPIK mutant and/or deletion-containing
human NPIK mutant.

Recently, several proteins belonging to the
family including the above-mentioned PI3 and 4 kinases
have been shown to have protein kinase activity [Dhand, R.,
et al., EMBO J., 13, 522-533 (1994); Stack, J. H. and
Emr, S. D., J. Biol. Chem., 269, 31552-31562 (1994);
Hartley, K. O., et al., Cell, 82, 849-856 (1995)].

It was also revealed that a protein belonging
to this family is involved in DNA repair [Hartley, K. O.,
et al., Cell, 82, 849-856 (1995)] and that a gene of this
family is a causative gene of ataxia [Savitsky, K., et al.,
Science, 268, 1749-1753 (1995)].

It can be anticipated that the protein encoded by
the human NPIK gene, which is highly homologous to the
family of these PI kinases, is a novel enzyme
phosphorylating lipids or proteins.

According to this example, the novel human NPIK
gene is provided. The use of said gene makes it possible
to detect the expression of said gene in various tissues
and manufacture the human NPIK protein by the technology
of genetic engineering and, through these, it becomes
possible to study lipid- or protein-phosphorylating
enzymes such as mentioned above, study DNA repair,
study or diagnose diseases in which these are involved,
for example cancer, and screen out and evaluate drugs for
the treatment or prevention thereof.

(4) Construction of an expression vector for fusion 
protein 

To subclone the coding region for a human NPIK
gene (GEN-428B12c2), first of all, two primers, C1 and
C2, having the sequences shown below in Table 10 were
formed based on the information on the DNA sequences
obtained above in (1).

Table 10

Primer       Nucleotide sequence
Primer C1    5'-CTCAGATCTATGGGAGATACAGTAGTGGAGC-3'
Primer C2    5'-TCGAGATCTTCACATGATGCCGTTGGTGAG-3'



Both of the primers C1 and C2 have a BglII
site, and primer C2 is an antisense primer.
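
The presence of the BglII site in both primers can be confirmed by searching for the BglII recognition sequence AGATCT. The sketch below is illustrative only; the function name and data layout are assumptions introduced for this example.

```python
# Illustrative search for the BglII recognition sequence in primers C1 and C2.
# site_positions is a hypothetical helper written for this sketch.

BGLII_SITE = "AGATCT"  # palindromic recognition sequence of BglII

PRIMERS = {
    "C1": "CTCAGATCTATGGGAGATACAGTAGTGGAGC",
    "C2": "TCGAGATCTTCACATGATGCCGTTGGTGAG",
}


def site_positions(seq, site=BGLII_SITE):
    """Return the 1-based start position of every occurrence of site in seq."""
    width = len(site)
    return [i + 1 for i in range(len(seq) - width + 1) if seq[i:i + width] == site]


for name, seq in PRIMERS.items():
    print(name, site_positions(seq))
```

In both primers the site is preceded by three extra 5' nucleotides and is followed directly by the gene-specific portion (beginning with ATG in C1 and, on the antisense strand, TCA in C2).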

Using these two primers, cDNA derived from
human fetal brain mRNA was amplified by PCR to provide a
product having a length of about 2500 bases. The
amplified cDNA was precipitated with ethanol and inserted
into pT7Blue T-Vector (product of Novagen) and subcloning
was completed. The entire sequence was determined in the
same manner as in the above Examples. As a result, it was
revealed that this gene had the polymorphism shown above in
Table 9.

The above cDNA was cleaved by BglII and
subjected to agarose gel electrophoresis. The cDNA was
then excised from the agarose gel and collected using
GENECLEAN II KIT (product of Bio 101). The cDNA was
inserted into pBlueBacHis2B-Vector (product of
Invitrogen) at the BglII cleavage site and subcloning was
completed.

The fusion vector thus obtained had a BglII
cleavage site and was an expression vector for a fusion
protein of the contemplated gene product (about 91 kd)
and 38 amino acids derived from pBlueBacHis2B-Vector and
containing a polyhistidine region and an epitope
recognized by Anti-Xpress™ antibody (product of
Invitrogen).

(5) Transfection into insect cell Sf-9 

The human NPIK gene was expressed according to
the Baculovirus expression system. Baculovirus is an
insect-pathogenic virus having a cyclic double-stranded
DNA genome and can produce large amounts of inclusion
bodies named polyhedrins in cells of insects. Using
Bac-N-Blue™ Transfection Kit utilizing this characteristic of
Baculovirus and developed by Invitrogen, the Baculovirus
expression was carried out.

Stated more specifically, 4 µg of pBlueBacHis2B
containing the region of the human NPIK gene and 1 µg of
Bac-N-Blue™ DNA (product of Invitrogen) were
co-transfected into Sf-9 cells in the presence of Insectin™
liposomes (product of Invitrogen).

Prior to co-transfection, LacZ gene was
incorporated into Bac-N-Blue™ DNA, so that LacZ would be
expressed only when homologous recombination took place
between the Bac-N-Blue™ DNA and pBlueBacHis2B. Thus
when the co-transfected Sf-9 cells were incubated on agar
medium, the plaques of the virus expressing the
contemplated gene were easily detected as blue plaques.

The blue plaques were excised from each agar
and suspended in 400 µl of medium to disperse the virus
therein. The suspension was subjected to centrifugation
to give a supernatant containing the virus. Sf-9 cells
were infected with the virus again to increase the titre
and to obtain a large amount of infective virus solution.

(6) Preparation of human NPIK

The expression of the contemplated human NPIK
gene was confirmed three days after infection with the
virus as follows.

Sf-9 cells were collected and washed with PBS. 
The cells were boiled with a SDS-PAGE loading buffer for 
5 minutes and SDS-PAGE was performed. According to the 
western blot technique using Anti-Xpress as an antibody, 
the contemplated protein was detected at the position of 
its presumed molecular weight. By contrast, in the case 
of control cells uninfected with the virus, no band 
corresponding to human NPIK was observed in the same 
test.

Stated more specifically, three days after the
infection of 15 flasks (175 cm², FALCON) of
semi-confluent Sf-9 cells, the cells were harvested and washed
with PBS, followed by resuspension in a buffer (20 mM
Tris/HCl (pH 7.5), 1 mM EDTA and 1 mM DTT). The
suspended cells were lysed by four 30-second sonications
at 4°C with 30-second intervals. The sonicated
cells were subjected to centrifugation and the
supernatant was collected. The protein in the
supernatant was immunoprecipitated using an Anti-Xpress
antibody and obtained as a slurry of protein A-Sepharose
beads. The slurry was boiled with a SDS-PAGE loading
buffer for 5 minutes. SDS-PAGE was performed for
identification and quantification of NPIK. The slurry
itself was subjected to the following assaying.

(7) Confirmation of PI4 Kinase activity

NPIK was expected to have the activity of
incorporating phosphoric acid at the 4-position of the
inositol ring of phosphatidylinositol (PI), namely, PI4
Kinase activity.

PI4 Kinase activity of NPIK was assayed
according to the method of Takenawa, et al. (Yamakawa, A.
and Takenawa, T., J. Biol. Chem., 263, 17555-17560
(1988)) as shown below.

First prepared was a mixture of 10 µl of a NPIK
slurry (20 mM Tris/HCl (pH 7.5), 1 mM EDTA, 1 mM DTT and
50% protein A beads), 10 µl of a PI solution (prepared by
drying 5 mg of a PI-containing commercial chloroform
solution in a stream of nitrogen onto a glass tube wall,
adding 1 ml of 20 mM Tris/HCl (pH 7.5) buffer and forming
micelles by sonication), 10 µl of an applied buffer (210
mM Tris/HCl (pH 7.5), 5 mM EGTA and 100 mM MgCl2) and 10
µl of distilled water. Thereto was added 10 µl of an ATP
solution (5 µl of 500 µM ATP, 4.9 µl of distilled water
and 0.1 µl of γ-32P ATP (6000 Ci/mmol, product of NEN
Co., Ltd.)). The reaction was started at 30°C and
continued for 2, 5, 10 and 20 minutes. The time 10
minutes was set as incubation time because a
straight-line increase was observed around 10 minutes in
incorporation of phosphoric acid into PI in the assaying
process described below.
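
The final composition of the 50 µl reaction mixture follows from the volumes and stock concentrations given above. The calculation below is a rough sketch under stated assumptions: the bead slurry is treated as 10 µl of homogeneous liquid, the contribution of the radiolabeled ATP to the total ATP concentration is ignored, and all names in the code are hypothetical.

```python
# Illustrative calculation of final concentrations in the kinase reaction.
# Each entry: (volume in microliters, {solute: stock concentration in mM}).
# Assumption: the 50% bead slurry is counted as 10 µl of liquid.

COMPONENTS = [
    (10, {"Tris": 20, "EDTA": 1, "DTT": 1}),       # NPIK slurry
    (10, {"Tris": 20}),                            # PI solution
    (10, {"Tris": 210, "EGTA": 5, "MgCl2": 100}),  # applied buffer
    (10, {}),                                      # distilled water
    (10, {"ATP": 0.5 * 5 / 10}),                   # 5 µl of 500 µM ATP in 10 µl
]


def final_concentrations(components):
    """Return (total volume, {solute: final concentration in mM})."""
    total = sum(volume for volume, _ in components)
    final = {}
    for volume, solutes in components:
        for name, conc in solutes.items():
            final[name] = final.get(name, 0.0) + conc * volume / total
    return total, final


total_volume, final = final_concentrations(COMPONENTS)
print(total_volume, "µl")
for name, conc in sorted(final.items()):
    print(f"{name}: {conc:.2f} mM")
```

Under these assumptions the mixture works out to 50 mM Tris/HCl, 20 mM MgCl2, 1 mM EGTA and 50 µM unlabeled ATP.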

After completion of the reaction, PI was
fractionated by the solvent extraction method and finally
re-suspended in chloroform. The suspension was developed
by thin layer chromatography (TLC) and the radioactivity
of the reaction product at the PI4P position was assayed
using an analyzer (trade name: Bio-Image; product of Fuji
Photo Film Co., Ltd.).

Fig. 1 shows the results. Fig. 1 is an 
analytical diagram of the results of assaying the 
radioactivity based on TLC as mentioned above. The right 
lane (2) is the fraction of Sf-9 cell cytoplasm infected 
with the NPIK-containing virus, whereas the left lane (1) 
is the fraction of uninfected Sf-9 cell cytoplasm. 

Also, predetermined amounts of Triton X-100 and 
adenosine were added to the above reaction system to 
check how such addition would affect the PI4 Kinase 
activity. The PI4 Kinase activity was assayed in the 
same manner as above. 

Fig. 2 shows the results. The results
confirmed that NPIK had a typical PI4 Kinase activity
accelerated by Triton X-100 and inhibited by adenosine.

Example 10

nel-related protein type 1 (NRP1) gene and nel-related
protein type 2 (NRP2) gene

(1) Cloning and DNA sequencing of NRP1 gene and NRP2
gene

EGF-like repeats have been found in many 
membrane proteins and in proteins related to growth 
regulation and differentiation. This motif seems to be 
involved in protein-protein interactions. 

Recently, a gene encoding nel, a novel peptide
containing five EGF-like repeats, was cloned from a chick
embryonic cDNA library [Matsuhashi, S., et al., Dev.
Dynamics, 203, 212-222 (1995)]. This product is
considered to be a transmembrane molecule with its
EGF-like repeats in the extracellular domain. A 4.5 kb
transcript (nel mRNA) is expressed in various tissues at
the embryonic stage and exclusively in brain and retina
after hatching.

Following the procedure of Example 1 (1), cDNA 
clones were randomly selected from a human fetal brain 
cDNA library and subjected to sequence analysis, followed 
by database searching. As a result, two cDNA clones with 
significantly high homology to the above-mentioned nel 
were found and named GEN-073E07 and GEN-093E05, 
respectively.

Since both clones were lacking in the 5'
portion, 5' RACE was performed in the same manner as in
Example 2 (2) to obtain the entire coding regions.

As for the primers for 5' RACE, primers having 
an arbitrary sequence obtained from the cDNA sequences of 
the above clones were synthesized while the anchor primer 
attached to a commercial kit was used as such. 

5' RACE clones obtained from the PCR were
sequenced and the sequences seemingly covering the entire
coding regions of both genes were obtained. These genes
were respectively named nel-related protein type 1 (NRP1)
gene and nel-related protein type 2 (NRP2) gene.

The NRP1 gene contains an open reading frame of
2,430 nucleotides, as shown under SEQ ID NO: 35, the amino
acid sequence deduced therefrom comprises 810 amino acid
residues, as shown under SEQ ID NO: 34, and the nucleotide
sequence of the entire cDNA clone of said NRP1 gene
comprises 2,977 nucleotides, as shown under SEQ ID NO: 36.

On the other hand, the NRP2 gene contains an
open reading frame of 2,448 nucleotides, as shown under
SEQ ID NO: 38, the amino acid sequence deduced therefrom
comprises 816 amino acid residues, as shown under SEQ ID
NO: 37, and the nucleotide sequence of the entire cDNA
clone of said NRP2 gene comprises 3,198 nucleotides, as
shown under SEQ ID NO: 39.

Furthermore, the coding regions were amplified
by RT-PCR to exclude the possibility that either of the
sequences obtained was a chimeric cDNA.

The deduced NRP1 and NRP2 gene products both
showed highly hydrophobic N termini capable of
functioning as signal peptides for membrane insertion. As
compared with chick embryonic nel, they both appeared to
have no hydrophobic transmembrane domain. Comparison
among NRP1, NRP2 and nel with respect to the deduced
peptide sequences revealed that NRP2 has 80% homology to
nel on the amino acid level and is more closely related
to nel than NRP1, which has 50% homology. The cysteine
residues in cysteine-rich domains and EGF-like repeats
were found completely conserved.

The most remarkable difference between the NRPs 
and the chick protein was that the human homologs lack 
the putative transmembrane domain of nel. However, even 
in this lacking region, the nucleotide sequences of NRPs 
were very similar to that of nel. Furthermore, the two 
NRPs each possessed six EGF-like repeats, whereas nel has 
only five. 

Other unique motifs of nel as reported by
Matsuhashi et al. [Matsuhashi, S., et al., Dev. Dynamics,
203, 212-222 (1995)] were also found in the NRPs at
equivalent positions. Since, as mentioned above, it was
shown that the two deduced NRP peptides are not
transmembrane proteins, the NRPs might be secretory
proteins or proteins anchored to membranes as a result of
posttranslational modification.

The present inventors speculate that NRPs might
function as ligands by stimulating other molecules such
as EGF receptors. The present inventors further found
that an extra EGF-like repeat could be encoded in nel
upon frame shifting of the membrane domain region of nel.

When paralleled and compared with NRP2 and nel, 
the frame-shifted amino acid sequence showed similarities 
over the whole range of NRP2 and of nel, suggesting that 
NRP2 might be a human counterpart of nel. In contrast, 
NRP1 is considered to be not a human counterpart of nel
but a homologous gene.

(2) Northern blot analysis

Northern blot analysis was carried out as described
in Example 1 (2). Thus, the entire sequences of
the cDNAs of both clones were amplified by PCR, the PCR
products were purified and labeled with [32P]-dCTP (random-primed
DNA labeling kit, Boehringer Mannheim) and human normal
tissues were examined for NRP mRNA expression using an
MTN blot with the labeled products as two probes.

Sixteen adult tissues and four human fetal
tissues were examined for the expression pattern of the two
NRPs.

As a result of the Northern blot analysis, it
was found that a 3.5 kb transcript of NRP1 was weakly
expressed in fetal and adult brain and kidney. A 3.6 kb
transcript of NRP2 was strongly expressed in adult and
fetal brain alone, with weak expression thereof in fetal
kidney as well.

This suggests that NRPs might play a brain- 
specific role, for example as signal molecules for growth 
regulation. In addition, these genes might have a 
particular function in kidney. 

(3) Chromosomal mapping of NRP1 gene and NRP2 gene by
FISH

Chromosomal mapping of the NRP1 gene and NRP2
gene was performed by FISH as described in Example 1 (3).

As a result, it was revealed that the
chromosomal locus of the NRP1 gene is localized to
11p15.1-p15.2 and the chromosomal locus of the NRP2 gene
to 12q13.11-q13.12.

According to the present invention, the novel
human NRP1 gene and NRP2 gene are provided and the use of
said genes makes it possible to detect the expression of
said genes in various tissues and produce the human NRP1
and NRP2 proteins by the technology of genetic
engineering. They can further be used in the study of
the brain neurotransmission system, diagnosis of various
diseases related to neurotransmission in the brain, and 
the screening and evaluation of drugs for the treatment 
and prevention of such diseases. Furthermore, the 
possibility is suggested that these EGF domain-containing 
NRPs act as growth factors in brain, hence they may be 
useful in the diagnosis and treatment of various kinds of 
intracerebral tumor and effective in nerve regeneration 
in cases of degenerative nervous diseases. 

Example 11

GSPT1-related protein (GSPT1-TK) gene

(1) GSPT1-TK gene cloning and DNA sequencing

The human GSPT1 gene is one of the human
homologous genes of the yeast GST1 gene that encodes the
GTP-binding protein essential for the G1 to S phase
transition in the cell cycle. The yeast GST1 gene, first
identified as a gene capable of complementing a
temperature-sensitive gst1 (G1-to-S transition) mutant of
Saccharomyces cerevisiae, was isolated from a yeast
genomic library [Kikuchi, Y., Shimatake, H. and Kikuchi,
A., EMBO J., 7, 1175-1182 (1988)] and encoded a protein
with a target site of cAMP-dependent protein kinases and
a GTPase domain.

The human GSPT1 gene was isolated from a KB
cell cDNA library by hybridization using the yeast GST1
gene as a probe [Hoshino, S., Miyazawa, H., Enomoto, T.,
Hanaoka, F., Kikuchi, Y., Kikuchi, A. and Ui, M., EMBO
J., 8, 3807-3814 (1989)]. The deduced protein of said
GSPT1 gene, like yeast GST1, has a GTP-binding domain and
a GTPase activity center, and plays an important role in
cell proliferation.

Furthermore, a breakpoint for chromosome
rearrangement has been observed in the GSPT1 gene located
in the chromosomal locus 16p13.3 in patients with acute
nonlymphocytic leukemia (ANLL) [Ozawa, K., Murakami, Y.,
Eki, T., Yokoyama, K., Soeda, E., Hoshino, S., Ui, M. and
Hanaoka, F., Somatic Cell and Molecular Genet., 18,
189-194 (1992)].

cDNA clones were randomly selected from a human
fetal brain cDNA library and subjected to sequence
analysis as described in Example 1 (1), and database
searching was performed. As a result, a clone having
a 0.3 kb cDNA sequence highly homologous to the
above-mentioned GSPT1 gene was found and named GEN-077A09. The
GEN-077A09 clone seemed to be lacking in the 5' region,
so that 5' RACE was carried out in the same manner as in
Example 2 (2) to obtain the entire coding region.

The primers used for the 5' RACE were P1 and P2
primers respectively having the nucleotide sequences
shown in Table 11 as designed based on the known cDNA
sequence of the above-mentioned cDNA, and the anchor
primer used was the one attached to the commercial kit.
Thirty-five cycles of PCR were performed under the
following conditions: 94°C for 45 seconds, 58°C for 45
seconds and 72°C for 2 minutes. Finally, elongation
reaction was carried out at 72°C for 7 minutes.



Table 11

Primer       Nucleotide sequence
P1 primer    5'-GATTTGTGCTCAATAATCACTATCTGAA-3'
P2 primer    5'-GGTTACTAGGATCACAAAGTATGAATTCTGGAA-3'



Several of the 5' RACE clones obtained from the
above PCR were sequenced and the base sequence of that
cDNA clone showing overlapping between the 5' RACE clones
and the GEN-077A09 clone was determined to thereby reveal
the sequence regarded as covering the entire coding
region. This was named GSPT1-related protein "GSPT1-TK
gene".

The GSPT1-TK gene was found to contain an open
reading frame of 1,497 nucleotides, as shown under SEQ ID
NO: 41. The amino acid sequence deduced therefrom
contained 499 amino acid residues, as shown under SEQ ID
NO: 40.

The nucleotide sequence of the whole cDNA clone
of the GSPT1-TK gene was found to comprise 2,057
nucleotides, as shown under SEQ ID NO: 42, and the
molecular weight of the deduced protein was calculated
at 55,740 daltons.

The first methionine codon (ATG) in the open
reading frame had no in-frame termination codon upstream
thereof, but this ATG was surrounded by a sequence similar
to the Kozak consensus sequence for translational initiation.
Therefore, it was concluded that this ATG triplet
occurring in positions 144-146 of the relevant sequence
is the initiation codon.

Furthermore, a polyadenylation signal, AATAAA, 
was observed 13 nucleotides upstream from the 
polyadenylation site. 

Human GSPT1-TK contains a glutamic acid rich
region near the N terminus, and 18 of 20 glutamic acid
residues occurring in this region of human GSPT1-TK are
conserved and align perfectly with those of the human
GSPT1 protein. Several regions (G1, G2, G3, G4 and G5)
of GTP-binding proteins that are responsible for guanine
nucleotide binding and hydrolysis were found conserved in
the GSPT1-TK protein just as in the human GSPT1 protein.

Thus, the DNA sequence of human GSPT1-TK was
found 89.4% identical, and the amino acid sequence
deduced therefrom 92.4% identical, with the corresponding
sequence of human GSPT1, which supposedly plays an
important role in the G1 to S phase transition in the
cell cycle. Said amino acid sequence showed 50.8%
identity with that of yeast GST1.

(2) Northern blot analysis

Northern blot analysis was carried out as described
in Example 1 (2). Thus, the GEN-077A09 cDNA clone
was amplified by PCR, the PCR product was purified and
labeled with [32P]-dCTP (random-primed DNA labeling kit,
Boehringer Mannheim), and normal human tissues were
examined for the expression of GSPT1-TK mRNA therein
using an MTN blot with the labeled product as a probe.

As a result of the Northern blot analysis, a
2.7 kb major transcript was detected in various tissues.
The level of human GSPT1-TK expression seemed highest in
brain and in testis.

(3) Chromosome mapping of GSPT1-TK gene by FISH

Chromosome mapping of the GSPT1-TK gene was
performed by FISH as described in Example 1 (3).

As a result, it was found that the GSPT1-TK
gene is localized at the chromosomal locus 19p13.3. In
this chromosomal localization site, reciprocal translocation
has been observed very frequently in cases of acute
lymphocytic leukemia (ALL) and acute myeloid leukemia
(AML). In addition, it is reported that acute
nonlymphocytic leukemia (ANLL) is associated with
rearrangements involving the human GSPT1 region [Ozawa, K.,
Murakami, Y., Eki, T., Yokoyama, K., Soeda, E., Hoshino,
S., Ui, M. and Hanaoka, F., Somatic Cell and Molecular
Genet., 18, 189-194 (1992)].

In view of the above, it is suggested that this 
gene is the best candidate gene associated with ALL and 
AML. 

In accordance with the present invention, the
novel human GSPT1-TK gene is provided and the use of said
gene makes it possible to detect the expression of said
gene in various tissues and produce the human GSPT1-TK
protein by the technology of genetic engineering. These
can be used in the studies of cell proliferation, as
mentioned above, and further make it possible to diagnose
various diseases associated with the chromosomal locus of
this gene, for example acute myelocytic leukemia. This
is because translocation of this gene may result in
disruption of the GSPT1-TK gene and, further, some
fused protein expressed upon said translocation may
cause such diseases.

Furthermore, it is expected that diagnosis and 
treatment of said diseases can be made possible by 
producing antibodies to such fused protein, revealing the 
intracellular localization of said protein and examining 
its expression specific to said diseases. Therefore, it 
is also expected that the use of the gene of the present
invention makes it possible to screen out and evaluate
drugs for the treatment and prevention of said diseases.

SEQUENCE LISTING



(1) GENERAL INFORMATION: 

(i) APPLICANT: Tsutomu, FUJIWARA 

Takeshi, WATANABE 
Masato, HORIE 
Toyomasa, KATAGIRI 

(ii) TITLE OF INVENTION: HUMAN GENE

(iii) NUMBER OF SEQUENCES: 42

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: Sughrue, Mion, Zinn, Macpeak & Seas

(B) STREET: 2100 Pennsylvania Avenue, N.W. 

(C) CITY: Washington 

(D) STATE: D.C. 

(E) COUNTRY: United States 

(F) ZIP: 20037-3202 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER:

(B) FILING DATE:

(C) CLASSIFICATION:

(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: (202) 293-7060 

(B) TELEFAX: (202) 293-7860 

(C) TELEX: 6491103 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 122 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:



Met Glu Leu Gly Glu Asp Gly Ser Val Tyr Lys Ser Ile Leu Val Thr
1 5 10 15

Ser Gln Asp Lys Ala Pro Ser Val Ile Ser Arg Val Leu Lys Lys Asn
20 25 30

Asn Arg Asp Ser Ala Val Ala Ser Glu Tyr Glu Leu Val Gln Leu Leu
35 40 45

Pro Gly Glu Arg Glu Leu Thr Ile Pro Ala Ser Ala Asn Val Phe Tyr
50 55 60

Pro Met Asp Gly Ala Ser His Asp Phe Leu Leu Arg Gln Arg Arg Arg
65 70 75 80

Ser Ser Thr Ala Thr Pro Gly Val Thr Ser Gly Pro Ser Ala Ser Gly
85 90 95

Thr Pro Pro Ser Glu Gly Gly Gly Gly Ser Phe Pro Arg Ile Lys Ala
100 105 110

Thr Gly Arg Lys Ile Ala Arg Ala Leu Phe
115 120



(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 366 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA(cDNA) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 



ATGGAGTTGG GGGAAGATGG CAGTGTCTAT AAGAGCATTT TGGTGACAAG CCAGGACAAG 60

GCTCCAAGTG TCATCAGTCG TGTCCTTAAG AAAAACAATC GTGACTCTGC AGTGGCTTCA 120

GAGTATGAGC TGGTACAGCT GCTACCAGGG GAGCGAGAGC TGACTATCCC AGCCTCGGCT 180

AATGTATTCT ACCCCATGGA TGGAGCTTCA CACGATTTCC TCCTGCGGCA GCGGCGAAGG 240

TCCTCTACTG CTACACCTGG CGTCACCAGT GGCCCGTCTG CCTCAGGAAC TCCTCCGAGT 300

GAGGGAGGAG GGGGCTCCTT TCCCAGGATC AAGGCCACAG GGAGGAAGAT TGCACGGGCA 360
CTGTTC 366

(2) INFORMATION FOR SEQ ID NO:3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 842 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: Human fetal brain cDNA library 

(B) CLONE: GEN-501D08 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 28..393 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

CCCAOGAOCC GTATCATOOG AGTCCAG ATG GAG TTG GGG GAA GAT GGC AGT 51
Met Glu Leu Gly Glu Asp Gly Ser
1 5

GTC TAT AAG AGC ATT TTG GTG ACA AGC CAG GAC AAG GCT CCA AGT GTC 99
Val Tyr Lys Ser Ile Leu Val Thr Ser Gln Asp Lys Ala Pro Ser Val
10 15 20

ATC AGT CGT GTC CTT AAG AAA AAC AAT CGT GAC TCT GCA GTG GCT TCA 147
Ile Ser Arg Val Leu Lys Lys Asn Asn Arg Asp Ser Ala Val Ala Ser
25 30 35 40

GAG TAT GAG CTG GTA CAG CTG CTA CCA GGG GAG CGA GAG CTG ACT ATC 195
Glu Tyr Glu Leu Val Gln Leu Leu Pro Gly Glu Arg Glu Leu Thr Ile
45 50 55

CCA GCC TCG GCT AAT GTA TTC TAC CCC ATG GAT GGA GCT TCA CAC GAT 243
Pro Ala Ser Ala Asn Val Phe Tyr Pro Met Asp Gly Ala Ser His Asp
60 65 70

TTC CTC CTG CGG CAG CGG CGA AGG TCC TCT ACT GCT ACA CCT GGC GTC 291
Phe Leu Leu Arg Gln Arg Arg Arg Ser Ser Thr Ala Thr Pro Gly Val
75 80 85

ACC AGT GGC CCG TCT GCC TCA GGA ACT CCT CCG AGT GAG GGA GGA GGG 339
Thr Ser Gly Pro Ser Ala Ser Gly Thr Pro Pro Ser Glu Gly Gly Gly
90 95 100

GGC TCC TTT CCC AGG ATC AAG GCC ACA GGG AGG AAG ATT GCA CGG GCA 387
Gly Ser Phe Pro Arg Ile Lys Ala Thr Gly Arg Lys Ile Ala Arg Ala
105 110 115 120

CTG TTC TGAGGAGGAA GOOOCTTTTT TTACAGAACT CAiFGOTGTTC ATAOCAGATG 443 
Leu Fhe 

TOOGTAOOCA TOCTGAATOG TGGCAATTAT AITCACATTGA GACAGAAATT CAGAAAGGGA 503 

GCTAGCCAOC CTOGGGCAGT GAAGTOOCAC TGGTTTAOCA GACAGCTGAG AAATOCAOCC 563 

CPGTOQGAAC TGGTCTCTTA TAAOCAAGTT GGATAOCTGT GTATAGCTTG OCAOCTTOCA 623 

TGAOTGCAOC ACACAOGTAG TGCTGGAAAA AOOCATCAGT TTCT6ATTCT TGGOCATATO 683 

CTAACATOCA AOOGOCAAOC AAAOOCTTCA AOOCTCTGAG OOOCAGGOCA GAOOGGAATG 743 

GCAAAATGTA GGTOCTGGCA GGAGCTCTTC TTOOCACTCT GGGGGTTTCr ATCACTGTGA 803 

CAACACTAAG ATAATAAAOC AAAACACTAC CTGAATTCT 842 
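
As an illustrative consistency check, which is not part of the listing itself, the CDS location stated for SEQ ID NO: 3 (28..393) can be compared with the 366 base pairs stated for SEQ ID NO: 2 and with the 122 residues of the encoded amino acid sequence. The sketch assumes 1-based, inclusive coordinates; the helper names `cds_length` and `codon_count` are hypothetical and chosen only for this example.

```python
def cds_length(start: int, end: int) -> int:
    """Number of bases in a CDS given 1-based, inclusive coordinates (assumed convention)."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


def codon_count(start: int, end: int) -> int:
    """Number of complete codons in the CDS; raises if the length is not a multiple of 3."""
    length = cds_length(start, end)
    if length % 3 != 0:
        raise ValueError(f"CDS length {length} is not a multiple of 3")
    return length // 3


# Values taken from the listing: SEQ ID NO: 3 has CDS LOCATION 28..393.
bases = cds_length(28, 393)    # 366, the length stated for SEQ ID NO: 2
codons = codon_count(28, 393)  # 122, the number of encoded residues
print(bases, codons)
```

The same arithmetic applies to the other CDS features in this listing; for example, a location of 274..852 spans 579 bases, which corresponds to 193 codons.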



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 193 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 



Met Glu Leu Glu Leu Tyr Gly Val Asp Asp Lys Phe Tyr Ser Lys Leu 
1 5 10 15 

Asp Gln Glu Asp Ala Leu Leu Gly Ser Tyr Pro Val Asp Asp Gly Cys 
20 25 30 

Arg Ile His Val Ile Asp His Ser Gly Ala Arg Leu Gly Glu Tyr Glu 
35 40 45 

Asp Val Ser Arg Val Glu Lys Tyr Thr Ile Ser Gln Glu Ala Tyr Asp 
50 55 60 

Gln Arg Gln Asp Thr Val Arg Ser Phe Leu Lys Arg Ser Lys Leu Gly 
65 70 75 80 

Arg Tyr Asn Glu Glu Glu Arg Ala Gln Gln Glu Ala Glu Ala Ala Gln 
85 90 95 

Arg Leu Ala Glu Glu Lys Ala Gln Ala Ser Ser Ile Pro Val Gly Ser 
100 105 110 

Arg Cys Glu Val Arg Ala Ala Gly Gln Ser Pro Arg Arg Gly Thr Val 
115 120 125 

Met Tyr Val Gly Leu Thr Asp Phe Lys Pro Gly Tyr Trp Ile Gly Val 
130 135 140 

Arg Tyr Asp Glu Pro Leu Gly Lys Asn Asp Gly Ser Val Asn Gly Lys 
145 150 155 160 

Arg Tyr Phe Glu Cys Gln Ala Lys Tyr Gly Ala Phe Val Lys Pro Ala 
165 170 175 

Val Val Thr Val Gly Asp Phe Pro Glu Glu Asp Tyr Gly Leu Asp Glu 
180 185 190 

Ile 



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 579 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA(cDNA) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

ATGGAACTGG AGCTGTATGG AGTTGACGAC AAGTTCTACA GCAAGCTGGA TCAAGAGGAT 
GOGCTCCTGG GCTOCTAOOC TGTAGATGAC GGCTGOOGCA TOCAOGTCAT TGAOCACAGT 
OOOGCOOGOC TTGGTGAGTA TGAG6A0GTG TOCOOOGTGG AGAAGTACAC GATCTCACAA 
GAAGCCTACG AOCAGAGGCA AGACAOGGTC OGCTCTTTOC TGAAGOGCAG CAAGCTOGQC 
CGGTACAACG AGGAGGAGCG GGCTCAGCAG GAGGCCGAGG CCGCCCAGCG CCTGGCCGAG 
GAGAAGGOOC AGGOCAGCTC CATOCCCX5TG GGCAGOOGCT GTGAGGTGOG GQOGGOGGGA 
CAATOXXJrC OCXX9G00CAC OGTCATGTAT GTAOGTCTCA CAGATTTCAA GOCTGOOTAC 
T0GATT0GT6 T0CX9CTATGA TGAC90CACTG OG6AAAAAT6 ATGOCAGTCT GAATG06AAA 
OGCTACTTOG AATGOCAGGC CAAGTATGGC GCX2TTTGTCA AGOCAGCAGT CXSTGAOSGTG 
GOGGACTTOC OOGAOGAOQA CEAOGOGTTG GAOGAGATA 

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1015 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: Human fetal brain cDNA library 

(B) CLONE: GEN-080G01 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 274..852 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

TGATTGGTCA GGCAOGGAOC A06A0G000G CTGATAOOOC AOCAOCAGCA G0G0GGGC06 

CGGXJTGOGGA GOGGGTGTGA GGCGGCTGGA OCGOGCTGCA GGCAT0CX30G GGCX30QGCAA 

GATGXSAOGTG AOGGGGGTGT CGGCAOCAOG GTGAOOGTTT TCATCAGCAG CTOOCTCAGC 

AOCTTOOGCT OOGAGAAOOG ATACAGOOGC AGOCTCACCA TCGCTGAGTT CAAGTGTAAA 

CTGGAGTTGC TGGTGGGCAG COCTGCTTOC TGC ATG GAA CTG GAG CTG TAT GGA 

Met Glu Leu Glu Leu Tyr Gly 
1 5 
GTT GAC GAC AAG TTC TAC AGC AAG CTG GAT CAA GAG GAT GCG CTC CTG 342 
Val Asp Asp Lys Phe Tyr Ser Lys Leu Asp Gln Glu Asp Ala Leu Leu 
10 15 20 

GOC TGC TAC CX:T GTA GAT GAC GOC TOO CGC ATC CAC GTC ATT GAC CAC 390 
Gly Ser Tyr Pro Val Asp Asp Gly Cys Arg Ile His Val Ile Asp His 
25 30 35 

AGT GGC GCC CGG CTT GGT GAG TAT GAG GAC GTG TCC CGG GTG GAG AAG 438 
Ser Gly Ala Arg Leu Gly Glu Tyr Glu Asp Val Ser Arg Val Glu Lys 
40 45 50 55 

TAC AOG ATC TCA CAA GAA GOC TAC GAC CAG AGO CAA GAC AOS GTC CGC 486 
Tyr Thr Ile Ser Gln Glu Ala Tyr Asp Gln Arg Gln Asp Thr Val Arg 
60 65 70 

TCT TTC CTG AAG CGC AGC AAG CTC GGC CGG TAC AAC GAG GAG GAG CGG 534 
Ser Phe Leu Lys Arg Ser Lys Leu Gly Arg Tyr Asn Glu Glu Glu Arg 
75 80 85 

OCT CAG CAG GAG GOC GAG GOC GOC CAG OGC CTG GOC GAG GAG AAG GOC 582 
Ala Gln Gln Glu Ala Glu Ala Ala Gln Arg Leu Ala Glu Glu Lys Ala 
90 95 100 

CAG GOC AGC TOC ATC OOC GTG GGC AGC CGC TGT GAG GIG CGG GOG GOG 630 
Gln Ala Ser Ser Ile Pro Val Gly Ser Arg Cys Glu Val Arg Ala Ala 
105 110 115 

GGA CAA TGC OCT OGC OGG GGC AOC GTC ATG TAT GTA GGT CTC ACA GAT 678 
Gly Gln Ser Pro Arg Arg Gly Thr Val Met Tyr Val Gly Leu Thr Asp 
120 125 130 135 

TTC AAG OCT GGC TAC TGG ATT GGT GTC OGC TAT GAT GAG OCA CTG GG6 726 
Phe Lys Pro Gly Tyr Trp Ile Gly Val Arg Tyr Asp Glu Pro Leu Gly 
140 145 150 

AAA AAT GAT GGC AGT GTG AAT GGG AAA OGC TAC TTC GAA TGC CAG OCC 774 
Lys Asn Asp Gly Ser Val Asn Gly Lys Arg Tyr Phe Glu Cys Gln Ala 
155 160 165 

AAG TAT GGC GCC TTT GTC AAG OCA GCA GTC GTG AOG GTG GGG GAC TTC 822 
Lys Tyr Gly Ala Phe Val Lys Pro Ala Val Val Thr Val Gly Asp Phe 
170 175 180 

COG GAG GAG GAC TAC GGG TTG GAC GAG ATA TGACACCTAA GGAATTOOOC 872 
Pro Glu Glu Asp Tyr Gly Leu Asp Glu Ile 
185 190 



TGCTTCAGCT OCTAGCTCAG OCACTGACTG OOOCTOCTGT GTGTGOOCAT GGOOCTTTTC 932 




TOCTGAOOOC ATTTTAATTT TATTCATTTT TTOCTTTGOC ATTGATITrT GAGACTCATG 992 

CATTAAATTC ACTAGAAAOC CAG 1015 


(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 128 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

Met Thr Glu Ala Asp Val Asn Pro Lys Ala Tyr Pro Leu Ala Asp Ala 
1 5 10 15 

His Leu Thr Lys Lys Leu Leu Asp Leu Val Gln Gln Ser Cys Asn Tyr 
20 25 30 

Lys Gln Leu Arg Lys Gly Ala Asn Glu Ala Thr Lys Thr Leu Asn Arg 
35 40 45 

Gly Ile Ser Glu Phe Ile Val Met Ala Ala Asp Ala Glu Pro Leu Glu 
50 55 60 

Ile Ile Leu His Leu Pro Leu Leu Cys Glu Asp Lys Asn Val Pro Tyr 
65 70 75 80 

Val Phe Val Arg Ser Lys Gln Ala Leu Gly Arg Ala Cys Gly Val Ser 
85 90 95 

Arg Pro Val Ile Ala Cys Ser Val Thr Ile Lys Glu Gly Ser Gln Leu 
100 105 110 

Lys Gln Gln Ile Gln Ser Ile Gln Gln Ser Ile Glu Arg Leu Leu Val 
115 120 125 

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 384 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

ATGACTGAGG CTGATGTGAA TOCAAAGGOC TATCOOCTTG OOGATGOOCA CCTCAOCAAG 
AAOCTACTOG AOCTOGTTCA GCAGTCATGT AACTATAAOC AOCTTOOGAA AOGAOOCAAT 
GAOOOCAOCA AAAOOCTCAA CAOGGOCATC TCTGAGTTCA TOGTGATGGC TOCAGAOGOC 
GAGOCACTGG AGATCATTCT GCAOCTGOOG CTGCTGTGTG AAGACAAGAA TGTGOOCTAC 
GTGTTTGTGC OCTOCAAGCA OOOCCTOOGG AGA0CCTGT6 GOGTCTOCAG OOCTGTCATC 
OOCTGTTCIG TCAOCATCAA AGAAOOCTOG CAOCTGAAAC AGCAGATCCA ATOCATTCAG 
CAGTOCATTG AAAGGCTCTT AGTC 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1493 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA( genomic ) 

(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: Human fetal brain cDNA library 

(B) CLONE: GEN-025F07 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 95..478 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

ATOOGTGTOC TTG0CX3TGCT GGQCAGCAGA COCSTOCAAAC OGACACGGGT GGTATOCTOG 



CGGTGTOOQG CAAGAGACTA OCAAGACAGA OGCT ATG ACT GAG OCT GAT GTG 

Met Thr Glu Ala Asp Val 
1 5 






AAT OCA AAG GCC TAT CXX: CTT GOC GAT GOC CAC CTG AOC AAG AAG CTA 160 
Asn Pro Lys Ala Tyr Pro Leu Ala Asp Ala His Leu Thr Lys Lys Leu 
10 15 20 


CTG GAG CTC OTT GAG GAG TCA TGT AAC TAT AAG CAG CTT OGG AAA GGA 208 
Leu Asp Leu Val Gln Gln Ser Cys Asn Tyr Lys Gln Leu Arg Lys Gly 
25 30 35 

GOC AAT GAG GOC AOC AAA ADC CTC AAC AGG GOC ATC TCT GAG TTC ATC 256 
Ala Asn Glu Ala Thr Lys Thr Leu Asn Arg Gly Ile Ser Glu Phe Ile 
40 45 50 

GTG ATG GOT GCA GAC GOC GAG OCA CTG GAG ATC ATT CTG CAC CTG OOG 304 
Val Met Ala Ala Asp Ala Glu Pro Leu Glu Ile Ile Leu His Leu Pro 
55 60 65 70 

CTG CTG TGT GAA GAC AAG AAT GTG OOC TAG GTG TTT GTG OGC TOO AAG 352 
Leu Leu Cys Glu Asp Lys Asn Val Pro Tyr Val Phe Val Arg Ser Lys 
75 80 85 

CAG GOC CTG GGG AGA GOC TGT GGG GTC TOC AGG OCT GTC ATC GOC TGT 400 
Gln Ala Leu Gly Arg Ala Cys Gly Val Ser Arg Pro Val Ile Ala Cys 
90 95 100 

TCT GTC ADC ATC AAA GAA GGG TOG GAG CTG AAA GAG GAG ATC GAA TOC 448 
Ser Val Thr Ile Lys Glu Gly Ser Gln Leu Lys Gln Gln Ile Gln Ser 
105 110 115 

ATT CAG GAG TOC ATT GAA AGG CTC TTA GTC TAAAOCTGTG GOCTCTGOCA 498 
Ile Gln Gln Ser Ile Glu Arg Leu Leu Val 
120 125 

OGTGCTOOCT GOCAGCTTOC GOOCTGAGGT TGTGTATCAT ATTATCTGTG TTAGCATGTA 558 

GTATTTTCAG CTACTCTCTA TTGTTATAAA ATGTAGTACT AAATCTGGTT TCTGGATTTT 618 

TGTGTTGrrT TTGTTCTGTT TTACAGQGTT GCTATOOOOC TTOCTTTOCT OOCTOOCTCT 678 

GOCATOCTTC ATOCTTTTAT OCTOOCTTTT TGGAACAAGT GTTCAGAGCA GACAGAAGCA 738 

GGGTGGTGGC AOOGTTGAAA GGCAGA7>kAGA GOCAGGAGAA AGCTGATGGA GOCAGGACAG 798 

AGATCTGGTT OS^GCTTTCA GOGACTAGCT TOCTGTTGTG TGOGGGGTGT GGTGGAATTA 858 

AACAGCATTC ATTGTGTGTC OCTGTGOCTG GCACACAGAA TCATTCATAC GTGTTCAAGT 918 

GATCAAGGGG TTTCATTTGC TCTTGGGGGA TTAGGTATCA TTTQGGGAGG AAGCATGTGT 978 

TCTGTGAGGT TGTTOGGCTA TGTOCAAGTG TOGTTTACTA ATGTAOOOCT GCTGTTTGCT 1038 
TTTGG?rAATG TGATGTTGAT GTTCTOOCXX: TAOCCACAAC aVTGX30CTTG AGGGTAGCAG 1098 

OGCAOCAOCA TA0CAAAGA6 ATGTOCTGCA 06ACT0006A 00CA00CT06 GTOGGIGAGC 1158 

CATGG00CA6 TTGAOCTOGG TCTTGAAAGA GT0GGGAGT6 ACAA0CTCA6 AGAOCATGAA 1218 

CTGATOCTOG CATGAAOGAT TOCAOGAAGA TCATG6AGAC CPGGCTGGTA GCTGTAACAG 1278 

AGATGGTGGA GTCCAAGGAA ACAGCCTGTC TCTGGTGAAT GGGACTTTCr TTGGTGGACA 1338 

CTTOGCACCA OCTCIGAGAG OOCTTOOOCT GTGTOCTOOC AOCATGIGGG TCAGATGTAC 1398 

TdCrGTCAC ATGAOGAGAG TOCTAGTTCA TGIGTTCTGC ATTCTTGTGA GCATOCTAAT 1458 

AAA!ICn?rTC CATTTTGAAA AAAAAAAAAA AAAAA 1493 

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 711 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

Met Pro Ala Asp Val Asn Leu Ser Gln Lys Pro Gln Val Leu Gly Pro 
1 5 10 15 

Glu Lys Gln Asp Gly Ser Cys Glu Ala Ser Val Ser Phe Glu Asp Val 
20 25 30 

Thr Val Asp Phe Ser Arg Glu Glu Trp Gln Gln Leu Asp Pro Ala Gln 
35 40 45 

Arg Cys Leu Tyr Arg Asp Val Met Leu Glu Leu Tyr Ser His Leu Phe 
50 55 60 

Ala Val Gly Tyr His Ile Pro Asn Pro Glu Val Ile Phe Arg Met Leu 
65 70 75 80 

Lys Glu Lys Glu Pro Arg Val Glu Glu Ala Glu Val Ser His Gln Arg 
85 90 95 

Cys Gln Glu Arg Glu Phe Gly Leu Glu Ile Pro Gln Lys Glu Ile Ser 
100 105 110 

Lys Lys Ala Ser Phe Gln Lys Asp Met Val Gly Glu Phe Thr Arg Asp 
115 120 125 

Gly Ser Trp Cys Ser Ile Leu Glu Glu Leu Arg Leu Asp Ala Asp Arg 
130 135 140 

Thr Lys Lys Asp Glu Gln Asn Gln Ile Gln Pro Met Ser His Ser Ala 
145 150 155 160 

Phe Phe Asn Lys Lys Thr Leu Asn Thr Glu Ser Asn Cys Glu Tyr Lys 
165 170 175 

Asp Pro Gly Lys Met Ile Arg Thr Arg Pro His Leu Ala Ser Ser Gln 
180 185 190 

Lys Gln Pro Gln Lys Cys Cys Leu Phe Thr Glu Ser Leu Lys Leu Asn 
195 200 205 

Leu Glu Val Asn Gly Gln Asn Glu Ser Asn Asp Thr Glu Gln Leu Asp 
210 215 220 

Asp Val Val Gly Ser Gly Gln Leu Phe Ser His Ser Ser Ser Asp Ala 
225 230 235 240 

Cys Ser Lys Asn Ile His Thr Gly Glu Thr Phe Cys Lys Gly Asn Gln 
245 250 255 

Cys Arg Lys Val Cys Gly His Lys Gln Ser Leu Lys Gln His Gln Ile 
260 265 270 

His Thr Gln Lys Lys Pro Asp Gly Cys Ser Glu Cys Gly Gly Ser Phe 
275 280 285 

Thr Gln Lys Ser His Leu Phe Ala Gln Gln Arg Ile His Ser Val Gly 
290 295 300 

Asn Leu His Glu Cys Gly Lys Cys Gly Lys Ala Phe Met Pro Gln Leu 
305 310 315 320 

Lys Leu Ser Val Tyr Leu Thr Asp His Thr Gly Asp Ile Pro Cys Ile 
325 330 335 

Cys Lys Glu Cys Gly Lys Val Phe Ile Gln Arg Ser Glu Leu Leu Thr 
340 345 350 

His Gln Lys Thr His Thr Arg Lys Lys Pro Tyr Lys Cys His Asp Cys 
355 360 365 

Gly Lys Ala Phe Phe Gln Met Leu Ser Leu Phe Arg His Gln Arg Thr 
370 375 380 

His Ser Arg Glu Lys Leu Tyr Glu Cys Ser Glu Cys Gly Lys Gly Phe 
385 390 395 400 

Ser Gln Asn Ser Thr Leu Ile Ile His Gln Lys Ile His Thr Gly Glu 
405 410 415 

Arg Gln Tyr Ala Cys Ser Glu Cys Gly Lys Ala Phe Thr Gln Lys Ser 
420 425 430 

Thr Leu Ser Leu His Gln Arg Ile His Ser Gly Gln Lys Ser Tyr Val 
435 440 445 

Cys Ile Glu Cys Gly Gln Ala Phe Ile Gln Lys Ala His Leu Ile Val 
450 455 460 

His Gln Arg Ser His Thr Gly Glu Lys Pro Tyr Gln Cys His Asn Cys 
465 470 475 480 

Gly Lys Ser Phe Ile Ser Lys Ser Gln Leu Asp Ile His His Arg Ile 
485 490 495 

His Thr Gly Glu Lys Pro Tyr Glu Cys Ser Asp Cys Gly Lys Thr Phe 
500 505 510 

Thr Gln Lys Ser His Leu Asn Ile His Gln Lys Ile His Thr Gly Glu 
515 520 525 

Arg His His Val Cys Ser Glu Cys Gly Lys Ala Phe Asn Gln Lys Ser 
530 535 540 

Ile Leu Ser Met His Gln Arg Ile His Thr Gly Glu Lys Pro Tyr Lys 
545 550 555 560 

Cys Ser Glu Cys Gly Lys Ala Phe Thr Ser Lys Ser Gln Phe Lys Glu 
565 570 575 

His Gln Arg Ile His Thr Gly Glu Lys Pro Tyr Val Cys Thr Glu Cys 
580 585 590 

Gly Lys Ala Phe Asn Gly Arg Ser Asn Phe His Lys His Gln Ile Thr 
595 600 605 

His Thr Arg Glu Arg Pro Phe Val Cys Tyr Lys Cys Gly Lys Ala Phe 
610 615 620 

Val Gln Lys Ser Glu Leu Ile Thr His Gln Arg Thr His Met Gly Glu 
625 630 635 640 

Lys Pro Tyr Glu Cys Leu Asp Cys Gly Lys Ser Phe Ser Lys Lys Pro 
645 650 655 

Gln Leu Lys Val His Gln Arg Ile His Thr Gly Glu Arg Pro Tyr Val 
660 665 670 

Cys Ser Glu Cys Gly Lys Ala Phe Asn Asn Arg Ser Asn Phe Asn Lys 
675 680 685 

His Gln Thr Thr His Thr Arg Asp Lys Ser Tyr Lys Cys Ser Tyr Ser 
690 695 700 

Val Lys Gly Phe Thr Lys Gln 
705 710 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2133 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA( genomic ) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

ATOCCIGCTG ATGTGAATTT ATOCCAGAAG OCTCAGGTOC TGOGTOCAGA GAAGCAOGAT 60 

GGATCIT0C3G AOOCATCACT GTCATTT6AG GAOGTGACOG T0QACTTCA6 CAOGGAOGAG 120 

TGGCAGCAAC TGGADCCTGC OCAGAGATGC CTGTAOOGGG ATGTGATGCT GGAGCTCTAT 180 

AOCCAICTCT TCOCAGTOGG GTATCACATT OOCAAOOCAG AOGTCATCTT CAGAATOCTA 240 

AAAGAAAAG6 AOCXXSCGTGT GGAOGAOGCT GAAGTCTCAC ATCAGA08IG TCAAGAAAOG 300 

GAGTTTOOGC TT6AAAT00C ACAAAAG6AG ATTTCTAAGA AAOCTTCATT TCAAAAOGAT 360 

ATGGTAGGTG AGTTCACAAG AGATGGTTCA TGGTGTTOCA TTTTA6AAGA ACTGAGGCTG 420 

GATGCTGAOC GCACAAAGAA AGATGAGCAA AATCAAATTC AAOOCATGAG TCACAGTOCT 480 

TTCTTCAACA AGAAAACATT GAACACAGAA AGCAATTGTG AATATAAOGA OOCTOGGAAA 540 

ATGATTCQCA CGAGGCOCCA CCTTQCTTCT TCACAGAAAC AAOCJICAGAA ATGTTGCTTA 600 

TTTACAGAAA GTTTGAAGCT GAAOCTAGAA GTGAAOGGTC AGAATGAAAG CAATGACACA 660 

GAACAGCTTG ATGAOGTTGT TGGGTCTGGT CAGCTATTCA GOCATAGCTC TTCTGATGCC 720 



TGCAGCAAGA ATATTCATAC AGGAGAGACA TTTTGCAAAG GTAACCAGTG TAGAAAAGTC 780 

TGTGGOCATA AACAGTCACT CAAQCAACAT CAAATTCATA CTCAGAAGAA AOCAGATOGA 840 

TGTTCTGAAT GTGGGGQGAG CTTCAODCAG AAGTCACAOC TCTTTGCX3CA ACAGAGAATT 900 

CATACTGTAG GAAAOCTCXZA TGAATCTGOC AAATGTGGAA AAGCXTTTCAT GCS^ACAACTA 960 

AAACTCAGTG TATATCTGAC AGATCATACA GGTGATATAC CXTTGTATATG CAAGGAATGT 1020 

GGGAAOGTCT TTATTCAGAG ATCAGAATT6 CTTAa3CACC AGAAAACACA CACTAGAAAG 1080 

AAOOOCTATA AATQOCATGA CTOTGGAAAA GOCTrTTTOC AGATGTTATC TCTCTTCAGA 1140 

CATCAGAGAA CTCACAGTAG AGAAAAACTC TATGAATOCA GTGAATGTG6 CAAAOOCTTC 1200 

TOXAAAACr CAAOOCTCAT TATACATCAG AAAATTCATA CTOGTGAGAG ACAGTATGCA 1260 

TOCAGTGAAT GTOGGAAAGC CTTTAOOCAG AAGTCAACAC TCAGCTTGCA CCAGAGAATC 1320 

CACrCAGGGC AGAAGTOCTA TCTCTGTATC GAATGOQGGC AGGOCTTCAT C3CAGA7«3GCA 1380 

CACXTTGATTG TOCATCAAAG AAGCX2ACACA GGAGAAAAAC CTTATCAGTG OCACAACTGT 1440 

GGGAAATOCT TCATTTOCAA GTCACAGCTT GATATACATC ATOSAATTCA TACAOGG6AG 1500 

AAACCTTAT6 AATGCAGTGA dX^TGGAAAA AOCTTCAOX: AAAAGTCACA CX7EGAATATA 1560 

CAOCAGAAAA TTCATACTGG AGAAAGACAC CATGTATGCA GTGAATGOOG GAAAGC3CTTC 1620 

AAOCAGAACTT CAATACICAG CATGCATCAG AGAATTCACA O00GAGAGAA-XXX7ITACAAA 1680 

TOCAGTGAAT GTOGGAAAGC CTTCACITCT AAGTCTCAAT TCAAAGAOCA TCAGOOAATT 1740 

GACAOOOGTO AGAAAOOCTA TGTGTOCACT GAATGTOOGA AOOOCTTCAA GGGCAOOTCA 1800 

AATTTOCATA AACATCAAAT AACTCACACT AGAGAGAGOC CTTTTTGrCTO TTACAAATGT 1860 

OOGAAGOCTT TTGTOCAGAA ATCAGAGTTG ATTAOOCATC AAAGAACTCA CAT006AGA6 1920 

AAAOOCTATO AAT00CTT6A CTGTGOGAAA TOGTTCAGTA AGAAAOCACA ACTCAAGOTO 1980 

CATCAOGGAA TTCACAOOOO AGAAAGAOCT TATGTGTGTT CTGAATOTOO AAAGOOCTTC 2040 
AACAACAGGT CAAACTTCAA TAAACACCAA ACAACTCATA OCAGAGACAA ATCTTACAAA 2100 
TGCAGTTATT CTGTGAAAGG CTTTAOCAAG CAA 2133 



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3754 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: Human fetal brain cDNA library 

(B) CLONE: GEN-076C09 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 346.. 2478 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 



GCTAAOOCTA TGTOGCTTAC TGGAOOCTGA AOIGATTGG6 AATATTAOCA GIGOGGGmC 60 

TGTAOOGTCA GGAAOOGGOG GCTOOCTTTG OGGGAGTGAT GAOOGOCTTG TTOOOOGTGG 120 

GGGTG0C3TGA TAAAGGGATT TCTCGGCTGA AGAOGAGGCT GTGAGGCTTC TGCAGRMXC 180 

CCAOGTCAGG OCACATCATT GAOOCTOCAG GATCICTCTT CATAGGOCAG TAOGACTCTC 240 

COGOGTGTOC CTGGTIXXSAA AATGCAAACA OCTATOCAGC TTCIOGCTOC TGGGAAAAGT 300 

GGAGTTGTCA OCAAGAGASA C0GA8AGTAG AAGC0CAGA6 TOGAG AT6 CCT GCT 354 

Met Pro Ala 
1 

GAT GTG AAT TTA TOC CAG ATVS OCT CAG GTC CDG GOT OCA GAG AAG CAG 402 
Asp Val Asn Leu Ser Gln Lys Pro Gln Val Leu Gly Pro Glu Lys Gln 
5 10 15 

GAT GGA TCT TGC GAG GCA TCA GTG TCA TTT GAG GAC GTG ACC GTG GAC 450 
Asp Gly Ser Cys Glu Ala Ser Val Ser Phe Glu Asp Val Thr Val Asp 
20 25 30 35 

TTC AGC AGG GAG GAG TGG CAG CAA CTG GAC CCT GOC CAG AGA TGC CTG 498 
Phe Ser Arg Glu Glu Trp Gln Gln Leu Asp Pro Ala Gln Arg Cys Leu 
40 45 50 



TAC CGG GAT GTG ATG CTG GAG CTG TAT AGC CAT CTC TTC GCA GTG GGG 546 
Tyr Arg Asp Val Met Leu Glu Leu Tyr Ser His Leu Phe Ala Val Gly 
55 60 65 

TAT CAC ATT CCC AAC CXIA GAG GTC ATC TTC AGA AT6 CTA AAA GAA AAG 594 
Tyr His Ile Pro Asn Pro Glu Val Ile Phe Arg Met Leu Lys Glu Lys 
70 75 80 

GAG OOG OGT GT6 GAG GAG GCT GAA GTC TCA CAT CAG AGG TGT CAA GAA 642 
Glu Pro Arg Val Glu Glu Ala Glu Val Ser His Gln Arg Cys Gln Glu 
85 90 95 

A06 GAG TTT OGG CTT GAA ATC CCA CAA AAG GAG ATT TCT AAG AAA GCT 690 
Arg Glu Phe Gly Leu Glu Ile Pro Gln Lys Glu Ile Ser Lys Lys Ala 
100 105 110 115 

TCA TTT CAA AAG GAT AT6 GTA GOT GAG TTC ACA AGA GAT GGT TCA TGG 738 
Ser Phe Gln Lys Asp Met Val Gly Glu Phe Thr Arg Asp Gly Ser Trp 
120 125 130 

TGT TCC ATT TTA GAA GAA CTG AGG CTG GAT GCT GAC CGC ACA AAG AAA 786 
Cys Ser Ile Leu Glu Glu Leu Arg Leu Asp Ala Asp Arg Thr Lys Lys 
135 140 145 

GAT GAG CAA AAT CAA ATT CAA CCC ATG AGT CAC AGT OCT TTC TTC AAC 834 
Asp Glu Gln Asn Gln Ile Gln Pro Met Ser His Ser Ala Phe Phe Asn 
150 155 160 

AAG AAA ACA TTG AAC ACA GAA AOC AAT TGT GAA TAT AAG GAC CCT GGG 882 
Lys Lys Thr Leu Asn Thr Glu Ser Asn Cys Glu Tyr Lys Asp Pro Gly 
165 170 175 

AAA ATG ATT CGC ADG AGG CCC CAC CTT GCT TCT TCA CAG AAA CAA OCT 930 
Lys Met Ile Arg Thr Arg Pro His Leu Ala Ser Ser Gln Lys Gln Pro 
180 185 190 195 

CAG AAA TGT TGC TTA TTT ACA GAA AGT TTG AAG CTG AAC CTA GAA GTG 978 
Gln Lys Cys Cys Leu Phe Thr Glu Ser Leu Lys Leu Asn Leu Glu Val 
200 205 210 

AAC GGT CAG AAT GAA AOC AAT GAC ACA GAA CAG CTT GAT GAC GTT GTT 1026 
Asn Gly Gln Asn Glu Ser Asn Asp Thr Glu Gln Leu Asp Asp Val Val 
215 220 225 

GGG TCT GGT CAG CTA TTC AOC CAT AGC TCT TCT GAT GOC TGC AGC AAG 1074 
Gly Ser Gly Gln Leu Phe Ser His Ser Ser Ser Asp Ala Cys Ser Lys 
230 235 240 

AAT ATT CAT ACA OGA GAG ACA TTT TGC AAA GGT AAC CAG TGT AGA AAA 1122 
Asn Ile His Thr Gly Glu Thr Phe Cys Lys Gly Asn Gln Cys Arg Lys 
245 250 255 
GTC TGT GGC CAT AAA CAG TCA CTC AAG CAA CAT CAA ATT CAT ACT CAG 1170 
Val Cys Gly His Lys Gln Ser Leu Lys Gln His Gln Ile His Thr Gln 
260 265 270 275 

AAG AAA CX:A GAT GGA TGT TCT GAA TGT GGG GGG PJ3C TTC ACC CAG AAG 1218 
Lys Lys Pro Asp Gly Cys Ser Glu Cys Gly Gly Ser Phe Thr Gln Lys 
280 285 290 

TCA CAC CTC TTT GOC CAA CAG AGA ATT CAT AGT GTA GGA AAC CTC CAT 1266 
Ser His Leu Phe Ala Gln Gln Arg Ile His Ser Val Gly Asn Leu His 
295 300 305 

GAA TGT GGC AAA TGT GGA AAA GOC TTC ATG CCA CAA CTA AAA CTC AGT 1314 
Glu Cys Gly Lys Cys Gly Lys Ala Phe Met Pro Gln Leu Lys Leu Ser 
310 315 320 

GTA TAT CTG ACA GAT CAT 7VCA GGT GAT ATA OOC TGT ATA TGC AAG GAA 1362 
Val Tyr Leu Thr Asp His Thr Gly Asp Ile Pro Cys Ile Cys Lys Glu 
325 330 335 

TGT GGG AAG GTC TTT ATT CAG AGA TCA GAA TTG CTT AOG CAC CAG AAA 1410 

Cys Gly Lys Val Phe Ile Gln Arg Ser Glu Leu Leu Thr His Gln Lys 
340 345 350 355 

ACA CAC ACT AGA AAG AAG COC TAT AAA TGC CAT GAC TGT GGA AAA- GOC 1458 
Thr His Thr Arg Lys Lys Pro Tyr Lys Cys His Asp Cys Gly Lys Ala 
360 365 370 

TTT TTC CAG ATG TTA TCT CTC TTC AGA CAT CAG AGA ACT CAC AGT AGA 1506 
Phe Phe Gln Met Leu Ser Leu Phe Arg His Gln Arg Thr His Ser Arg 
375 380 385 

GAA AAA CTC TAT GAA TGC AGT GAA TGT GGC AAA GGC TTC TOC CAA AAC 1554 
Glu Lys Leu Tyr Glu Cys Ser Glu Cys Gly Lys Gly Phe Ser Gln Asn 
390 395 400 

TCA ACC CTC ATT ATA CAT CAG AAA ATT CAT ACT GGT GAG AGA CAG TAT 1602 
Ser Thr Leu Ile Ile His Gln Lys Ile His Thr Gly Glu Arg Gln Tyr 
405 410 415 

GCA TGC AGT GAA TGT GGG AAA GOC TTT ACC CAG AAG TCA ACA CTC AGO 1650 
Ala Cys Ser Glu Cys Gly Lys Ala Phe Thr Gln Lys Ser Thr Leu Ser 
420 425 430 435 

TTG CAC CAG AGA ATC CAC TCA GGG CAG AAG TOC TAT GTG TGT ATC GAA 1698 
Leu His Gln Arg Ile His Ser Gly Gln Lys Ser Tyr Val Cys Ile Glu 
440 445 450 

TGC GGG CAG GOC TTC ATC CAG AAG GCA CAC CTG ATT GTC CAT CAA AGA 1746 
Cys Gly Gln Ala Phe Ile Gln Lys Ala His Leu Ile Val His Gln Arg 
455 460 465 

AGC CAC ACA GGA GAA AAA OCT TAT GAG TGC CAC AAC TGT GGG AAA TOC 1794 
Ser His Thr Gly Glu Lys Pro Tyr Gln Cys His Asn Cys Gly Lys Ser 
470 475 480 

TTC ATT TCX: AAG TCA CAG CTT GAT ATA CAT CAT CGA ATT CAT ACA GGG 1842 
Phe Ile Ser Lys Ser Gln Leu Asp Ile His His Arg Ile His Thr Gly 
485 490 495 

GAG AAA OCT TAT GAA TGC AGT GAC TGT GGA AAA AOC TTC AOC CAA AAG 1890 
Glu Lys Pro Tyr Glu Cys Ser Asp Cys Gly Lys Thr Phe Thr Gln Lys 
500 505 510 515 

TCA CAC CIG AAT ATA CAC CAG AAA ATT CAT ACT GGA GAA AGA CAC CAT 1938 
Ser His Leu Asn Ile His Gln Lys Ile His Thr Gly Glu Arg His His 
520 525 530 

GTA TOC AGT GAA TGC GGG AAA GGC TTC AAC CAG AAG TCA ATA CTC AGC 1986 
Val Cys Ser Glu Cys Gly Lys Ala Phe Asn Gln Lys Ser Ile Leu Ser 
535 540 545 

ATG CAT CAG AGA ATT CAC AOC GGA GAG AAG CCT TAG AAA TGC AGT GAA 2034 
Met His Gln Arg Ile His Thr Gly Glu Lys Pro Tyr Lys Cys Ser Glu 
550 555 560 

TGT GGG AAA GOG TTC ACT TCT AAG TCT CAA TTC AAA GAG CAT CAG CGA 2082 
Cys Gly Lys Ala Phe Thr Ser Lys Ser Gln Phe Lys Glu His Gln Arg 
565 570 575 

ATT CAC AOG GGT GAG AAA GGC TAT GTG TOG ACT GAA TGT GOg AAG OGG 2130 
Ile His Thr Gly Glu Lys Pro Tyr Val Cys Thr Glu Cys Gly Lys Ala 
580 585 590 595 

TTC AAC GGC AGG TCA AAT TTC CAT AAA CAT CAA ATA ACT CAC ACT AGA 2178 
Phe Asn Gly Arg Ser Asn Phe His Lys His Gln Ile Thr His Thr Arg 
600 605 610 

GAG AGG OCT TTT GTG TGT TAG AAA TGT GGG AAG GCT TTT GTG CAG AAA 2226 
Glu Arg Pro Phe Val Cys Tyr Lys Cys Gly Lys Ala Phe Val Gln Lys 
615 620 625 

TCA GAG TTG ATT AOG CAT CAA AGA ACT CAC ATG GGA GAG AAA COG TAT 2274 
Ser Glu Leu Ile Thr His Gln Arg Thr His Met Gly Glu Lys Pro Tyr 
630 635 640 

GAA TGC CTT GAC TGT GGG AAA TGG TTG AGT AAG AAA OCA CAA CTC AAG 2322 
Glu Cys Leu Asp Cys Gly Lys Ser Phe Ser Lys Lys Pro Gln Leu Lys 
645 650 655 
GTG CAT CAG CGA ATT CAC ACG GGA GAA AGA CCT TAT GTG TGT TCT GAA 2370 
Val His Gln Arg Ile His Thr Gly Glu Arg Pro Tyr Val Cys Ser Glu 
660 665 670 675 


TGT GGA AAG GOC TTC AAC AAC AGG TC3V AAC TTC AAT AAA CAC CAA ACA 2418 
Cys Gly Lys Ala Phe Asn Asn Arg Ser Asn Phe Asn Lys His Gln Thr 
680 685 690 

ACT CAT AOC AGA GAC AAA TCT TAG AAA TGC AGT TAT TCT GTG 7VAA GGC 2466 
Thr His Thr Arg Asp Lys Ser Tyr Lys Cys Ser Tyr Ser Val Lys Gly 
695 700 705 

TTT AOC AAG CAA TGAATTOCTA GTGCATCAGC ATATTCATAA ATGAAATATA 2518 
Phe Thr Lys Gln 
710 

CTOCGAGTTT CTTGAAGAAG AGAACATCTT CTCAGAATCA GGTCTAATTA TATGTTATTG 2578 

AATTCATGCT TCAGAAAAAC TCTAOOGATG CACTOCATCT GT6AACACAT GATAAAAAAG 2638 

TCATGCTTTA TTTTAGTGAG GGCAATTACA GAGAAAAGAG TAAGCAGAAA TGTOCTTCTG 2698 

AGTACTGQOC TCATTAAGGA TTATAAATTT TCTOOOOGGG A/s^GA/^AOOCT GACTAAOGCA 2758 

TTGAGAAAAG OCTTTCTGTA AAGAATOGTA CAAGACAOGT TGTTACT06A TTATTTATAG 2818 

TAAAATATGT GOGAZ^TTAT ATCAATGATA AOOCTGrTTA TTGTGGGATA TCAATATTTT 2878 

TAAAGTOOCA ACACAGTCAT GATAGGACAA TATTTTATGT GTGTGTGTGC GOCTTATGTA 2938 

TATAAOCATA TATATAATAT ATAAGCATAT TATTATAT7VC AGGTTGAGTA TOOCTTCTOC 2998 

AAAATOOCTG GGATCAGAAG CATTTTOGAT TTCAGATACT TACAGATTTT 0GAATATTT6 3058 

CATTATATTT ATTOGTTGAG CATOOCTAAT CTGAAAATCC AAGATTAAAT GCTOCAATTA 3118 

GCATTTOCTT TGAGOSTCAT GrTAGAGTTC AAAAAGTTTC AGATTTTQGG TTTTCAGATT 3178 

AGGAATAOOC AAOCTGTATG TAOGTATATT TCTGTATCTA TGTATGTATA TATATGCATA 3238 

TGCAGACATA TGTATATGGPT CTGGTCAGCA TATGTGTATG TATGOGTATG TATGTATGTA 3298 

TGTATGOOCT CAGTGCAGTG GGGTTTGCTG CAGAATTCAC TGCATAGCAG GAGATGTAAG 3358 

CAGATGAGTT ATTTTTTAAG AGAATCTAAT CTAATTGTTT TTATAAAAAT TATTOOCTAT 3418 

TGAATATTTA TATAATGAGG TTGTATCAAC AATGATTAAC TCCTTTATTA TACATACACA 3478 
TGAATGTGCA TTTTTGGTAA ATGCATAAAT GAGATTCTAT AATGTTTACT GATCTTTATA 3538 

TTACAGATTT TCTCTrCTTT TAGGATTAGC TCAGCTTGOC COOOCTTTCX: ATCTOCAOCA 3598 

TCTATAGTGA GCXyrCTOCAT AATTAGTOOC AAOCATTAGT CTOGTTCATA TTTTTACAOC 3658 

AOGAGTCAAC AAACTGTGCX: ATTOGOCAAA TATGGOCTOC O^ACTCTTTT TTTAAAATAA 3718 

AGTTTTATTG GAACACAAAA AAAAAAAAAA AAAAAA 3754 

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 389 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

Met Ala Asp Pro Arg Asp Lys Ala Leu Gln Asp Tyr Arg Lys Lys Leu 
1 5 10 15 

Leu Glu His Lys Glu Ile Asp Gly Arg Leu Lys Glu Leu Arg Glu Gln 
20 25 30 

Leu Lys Glu Leu Thr Lys Gln Tyr Glu Lys Ser Glu Asn Asp Leu Lys 
35 40 45 

Ala Leu Gln Ser Val Gly Gln Ile Val Gly Glu Val Leu Lys Gln Leu 
50 55 60 

Thr Glu Glu Lys Phe Ile Val Lys Ala Thr Asn Gly Pro Arg Tyr Val 
65 70 75 80 

Val Gly Cys Arg Arg Gln Leu Asp Lys Ser Lys Leu Lys Pro Gly Thr 
85 90 95 

Arg Val Ala Leu Asp Met Thr Thr Leu Thr Ile Met Arg Tyr Leu Pro 
100 105 110 

Arg Glu Val Asp Pro Leu Val Tyr Asn Met Ser His Glu Asp Pro Gly 
115 120 125 

Asn Val Ser Tyr Ser Glu Ile Gly Gly Leu Ser Glu Gln Ile Arg Glu 
130 135 140 

Leu Arg Glu Val Ile Glu Leu Pro Leu Thr Asn Pro Glu Leu Phe Gln 
145 150 155 160 

Arg Val Gly Ile Ile Pro Pro Lys Gly Cys Leu Leu Tyr Gly Pro Pro 
165 170 175 

Gly Thr Gly Lys Thr Leu Leu Ala Arg Ala Val Ala Ser Gln Leu Asp 
180 185 190 

Cys Asn Phe Leu Lys Val Val Ser Ser Ser Ile Val Asp Lys Tyr Ile 
195 200 205 

Gly Glu Ser Ala Arg Leu Ile Arg Glu Met Phe Asn Tyr Ala Arg Asp 
210 215 220 

His Gln Pro Cys Ile Ile Phe Met Asp Glu Ile Asp Ala Ile Gly Gly 
225 230 235 240 

Arg Arg Phe Ser Glu Gly Thr Ser Ala Asp Arg Glu Ile Gln Arg Thr 
245 250 255 

Leu Met Glu Leu Leu Asn Gln Met Asp Gly Phe Asp Thr Leu His Arg 
260 265 270 

Val Lys Met Thr Met Ala Thr Asn Arg Pro Asp Thr Leu Asp Pro Ala 
275 280 285 

Leu Leu Arg Pro Gly Arg Leu Asp Arg Lys Ile His Ile Asp Leu Pro 
290 295 300 

Asn Glu Gln Ala Arg Leu Asp Ile Leu Lys Ile His Ala Gly Pro Ile 
305 310 315 320 

Thr Lys His Gly Glu Ile Asp Tyr Glu Ala Ile Val Lys Leu Ser Asp 
325 330 335 

Gly Phe Asn Gly Ala Asp Leu Arg Asn Val Cys Thr Glu Ala Gly Met 
340 345 350 

Phe Ala Ile Arg Ala Asp His Asp Phe Val Val Gln Glu Asp Phe Met 
355 360 365 

Lys Ala Val Arg Lys Val Ala Asp Ser Lys Lys Leu Glu Ser Lys Leu 
370 375 380 

Asp Tyr Lys Pro Val 
385 



(2) INFORMATION FOR SEQ ID NO: 14: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1167 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA( genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

ATGGOOGAOC CTAGAGATAA OGOGCTTCAG GACTACCGCA AGAAGTTOCT TGAACACAA6 60 

GAGATOGACG OCXSGTCTTAA. 06AGTTAA06 GAACAATTAA. AAGAACTTAC CAAOCAGTAT 120 

GAAAAGTCI6 AAAATGATCT GATVOOOOCTA CAGAOTGTTG GGCAGATCXTT GOGTGAAGTG 180 

CTTAAACAGT TAACTGAAGA AAAATTCATT GTTAAAOCTA CX:AAT06ACC AAGATATGTT 240 

GTGOGnTGrrC GTOGACAOCT TGACAAAAGT AAOCTGAAOC CAGGAACAAG AGTTGCTTTG 300 

GATATGACTA CACTAACTAT CATGAGATAT TTQCOGAGAG TUGGTGGATOC ACTGGTTTAT 360 

AACATGTCTC ATGAOGACOC TGOGAATGTT TCTTATTCTG AGATTGGAOG OCTATCAGAA 420 

CAGATOOOGG aattaagaga ogtgatagaa tta£x:tctta CAAAOOCAGA GTTATTTCAG 480 

GGTGTAGGAA TAATAOCrCC 7\AAAGGCTGT TTGTTATATG GAOCAOCAGG TAOGGGAAAA 540 

ACACTCTTQ6 CAOGAGCOGT TGCTAGOCAG CTGGACTGCA ATTTCTTAAA GGTTGTATCT 600 

AGTTCTATTG TA6ACAAGTA CATTOGTGAA AGTGCTCGrT TGATCAGAGA AATGTTTAAT 660 

TATOCTAGAG ATCATGAAOC ATOCATCATT TTTATGGAT6 AAATAGATGC TATTOGTOGT . 720 

GGTGOGTTTT CTGAGOGTAC TTCAGCTGAC AGAGAGATTC AGAGAAOGTT AATGGAGTTA 780 

CTGAATCAAA TOGATOGATT TGATACTCTG CATAGAGTTA T^TGACCAT GGCTACAAAC 840 

AGACCAGATA CACIGGATOC TGCTTTGCTG CGTOCAGGAA GATTAGATAG AAAAATACAT 900 

ATTGATTTOC CAAATGAACA AGCAAGATTA GACATACIGA AAATOCATGC AGGTCXX3VTT 960 

ACAAAGCATG GTGAAATAGA TTATGAAOCA ATTC?PGAAGC TTTOGGATGG CTTTAATGGA 1020 

GCAGATCTGA GAAATGTTTG TACTGAAGCA GGTATGTTOG CAATTOGTGC TGATCATGAT 1080 

TTTGTAGTAC AGGAAGACTT CATGAAAGCA GTCAGAAAAG TGQCTGATTC TAAGAAGCTG 1140 

GAGTCTAAAT TOGACTACAA AOCTTGTG 1167 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1566 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: Human fetal brain cDNA library 

(B) CLONE: GEN-331G07 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 17.. 1183 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 



GAGAOOOCTT CTCATC ATG GCG GAC OCT AGA GAT AAG 006 CTT GAG GMZ 

Met Ala Asp Pro Arg Asp Lys Ala Leu Gln Asp 
1 5 10 

TAC OOC AAG AAG TTG CTT GAA CAC AAG GAG ATC GAC OGC OGT CTT AAG 
Tyr Arg Lys Lys Leu Leu Glu His Lys Glu Ile Asp Gly Arg Leu Lys 
15 20 25 

GAG TTA AGG GAA CAA TTA AAA GAA CTT AOC AAG CAG TAT GAA AAG TCT 
Glu Leu Arg Glu Gln Leu Lys Glu Leu Thr Lys Gln Tyr Glu Lys Ser 
30 35 40 

GAA AAT GAT CTG AAG GOC CTA CAG AGT GTT GGG CAG ATC GTG GGT GAA 
Glu Asn Asp Leu Lys Ala Leu Gln Ser Val Gly Gln Ile Val Gly Glu 
45 50 55 

GTG CTT AAA CAG TTA ACT GAA GAA AAA TTC ATT GTT AAA GCT ACC AAT
Val Leu Lys Gln Leu Thr Glu Glu Lys Phe Ile Val Lys Ala Thr Asn
60 65 70 75 

GGA CCA AGA TAT GTT GTG GGT TGT CGT CGA CAG CTT GAC AAA AGT AAG
Gly Pro Arg Tyr Val Val Gly Cys Arg Arg Gln Leu Asp Lys Ser Lys
80 85 90



CTG AAG CCA GGA ACA AGA GTT GCT TTG GAT ATG ACT ACA CTA ACT ATC
Leu Lys Pro Gly Thr Arg Val Ala Leu Asp Met Thr Thr Leu Thr Ile
95 100 105

ATG AGA TAT TTG COG AGA GAG GTG GAT OCA CTG GTT TAT AAC ATG TOT 385 
Met Arg Tyr Leu Pro Arg Glu Val Asp Pro Leu Val Tyr Asn Met Ser 
110 115 120 

CAT GAG GAG OCT GOG AAT GTT TCT TAT TOT GAG ATT GGA GGG OTA TCA 433 
His Glu Asp Pro Gly Asn Val Ser Tyr Ser Glu Ile Gly Gly Leu Ser
125 130 135

GAA GAG ATC O06 GAA TTA AGA GAG GTG ATA GAA TTA OCT CTT ACA AAC 481 
Glu Gln Ile Arg Glu Leu Arg Glu Val Ile Glu Leu Pro Leu Thr Asn
140 145 150 155 

OCA GAG TTA TTT GAG OGT GTA GGA ATA ATA OCT OCA AAA GGC TGT TTG 529 
Pro Glu Leu Phe Gln Arg Val Gly Ile Ile Pro Pro Lys Gly Cys Leu
160 165 170 

TTA TAT GGA OCA OCA GGT AOG GGA AAA ACA CTC TTG GCA OGA GCC GTT 577 
Leu Tyr Gly Pro Pro Gly Thr Gly Lys Thr Leu Leu Ala Arg Ala Val
175 180 185 

OCT AGC GAG GIG GAC TGC AAT TTG TTA AAG GTT OTA TCT ACT TCT ATT 625 
Ala Ser Gln Leu Asp Cys Asn Phe Leu Lys Val Val Ser Ser Ser Ile
190 195 200 

OTA GAC AAG TAG ATT GOT GAA ACT GCT OCT TTG ATC AGA GAA ATG TTT 673 
Val Asp Lys Tyr Ile Gly Glu Ser Ala Arg Leu Ile Arg Glu Met Phe
205 210 215 

AAT TAT OCT AGA GAT CAT CAA OCA TGC ATC ATT TTT ATG GAT GAA ATA 721 
Asn Tyr Ala Arg Asp His Gln Pro Cys Ile Ile Phe Met Asp Glu Ile
220 225 230 235

GAT GCT ATT GCT GCT OCT OGG TTT TCT GAG GCT ACT TCA GCT GAC AGA 769 
Asp Ala Ile Gly Gly Arg Arg Phe Ser Glu Gly Thr Ser Ala Asp Arg
240 245 250 

GAG ATT GAG AGA AOG TTA ATG GAG TTA CTG AAT CAA ATG GAT GGA TTT 817 
Glu Ile Gln Arg Thr Leu Met Glu Leu Leu Asn Gln Met Asp Gly Phe
255 260 265 

GAT ACT CTG CAT AGA GTT AAA ATG ACC ATG GCT ACA AAC AGA CCA GAT 865
Asp Thr Leu His Arg Val Lys Met Thr Met Ala Thr Asn Arg Pro Asp 
270 275 280 

ACA CTG GAT CCT GCT TTG CTG CGT CCA GGA AGA TTA GAT AGA AAA ATA 913
Thr Leu Asp Pro Ala Leu Leu Arg Pro Gly Arg Leu Asp Arg Lys Ile
285 290 295

CAT ATT GAT TTG CCA AAT GAA CAA GCA AGA TTA GAC ATA CTG AAA ATC 961
His Ile Asp Leu Pro Asn Glu Gln Ala Arg Leu Asp Ile Leu Lys Ile
300 305 310 315 

C3VT GCA GGT CXX: ATT ACA AAG CAT GOT GAA ATA GAT TAT GAA GCA ATT 1009 
His Ala Gly Pro Ile Thr Lys His Gly Glu Ile Asp Tyr Glu Ala Ile
320 325 330 

GTG AAG CTT TOG GAT GGC TTT AAT GGA GCA GAT CTG AGA AAT GTT TGT 1057 
Val Lys Leu Ser Asp Gly Phe Asn Gly Ala Asp Leu Arg Asn Val Cys 
335 340 345 

ACT GAA GCA GGT ATG TTC GCA ATT GGT GCT GAT CAT GAT TTT GTA GTA 1105 
Thr Glu Ala Gly Met Phe Ala Ile Arg Ala Asp His Asp Phe Val Val
350 355 360 

CAG GAA GAC TTC ATG AAA GCA GTC AGA AAA GTG GCT GAT TCT AAG AAG 1153 
Gln Glu Asp Phe Met Lys Ala Val Arg Lys Val Ala Asp Ser Lys Lys
365 370 375 

CTG GAG TCT AAA TTG GAC TAC AAA OCT GTG TAATTTACTG TAAGATTTTT 1203 
Leu Glu Ser Lys Leu Asp Tyr Lys Pro Val 
380 385 

GATOOCTOCA TGACAGATGT TGGCTTATTG TAAAAATAAA GTTAAAGAAA ATAATGTATG 1263 

TATTOGCAAT GATGTCATTA AAAGTATATG AATAAAAATA T6AGTAACAT CATAAAAATT 1323 

AGTAATTCAA CTTTTAAGAT ACAGAAGAAA TTTGTTATGTT TGTTAAAGTT GCATTTATTG 1383 

CAGCAAGTTA CAAAGGGA7A GTGTTGAAGC TTTTCATATT TGCTGOGTGA GCATTTTGTA 1443 

AAATATTGAA AG?rGGTTTGA GATAGTGGTA TAAGAAAGCA TTTCTTATGA CTTATTTTGT 1503 

ATCATTTGTT TTOCTCATCT AAAAAGTTGA ATAAAATCTG TTTGATTCAG TTCTOCTAAA 1563 

AAA 1566 



(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 223 amino acids 

(B) TYPE: amino acid
(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

Met Ser Asp Glu Glu Ala Arg Gln Ser Gly Gly Ser Ser Gln Ala Gly
1 5 10 15

Val Val Thr Val Ser Asp Val Gln Glu Leu Met Arg Arg Lys Glu Glu
20 25 30

Ile Glu Ala Gln Ile Lys Ala Asn Tyr Asp Val Leu Glu Ser Gln Lys
35 40 45

Gly Ile Gly Met Asn Glu Pro Leu Val Asp Cys Glu Gly Tyr Pro Arg
50 55 60

Ser Asp Val Asp Leu Tyr Gln Val Arg Thr Ala Arg His Asn Ile Ile
65 70 75 80

Cys Leu Gln Asn Asp His Lys Ala Val Met Lys Gln Val Glu Glu Ala
85 90 95

Leu His Gln Leu His Ala Arg Asp Lys Glu Lys Gln Ala Arg Asp Met
100 105 110

Ala Glu Ala His Lys Glu Ala Met Ser Arg Lys Leu Gly Gln Ser Glu
115 120 125

Ser Gln Gly Pro Pro Arg Ala Phe Ala Lys Val Asn Ser Ile Ser Pro
130 135 140

Gly Ser Pro Ala Ser Ile Ala Gly Leu Gln Val Asp Asp Glu Ile Val
145 150 155 160

Glu Phe Gly Ser Val Asn Thr Gln Asn Phe Gln Ser Leu His Asn Ile
165 170 175

Gly Ser Val Val Gln His Ser Glu Gly Lys Pro Leu Asn Val Thr Val
180 185 190

Ile Arg Arg Gly Glu Lys His Gln Leu Arg Leu Val Pro Thr Arg Trp
195 200 205

Ala Gly Lys Gly Leu Leu Gly Cys Asn Ile Ile Pro Leu Gln Arg
210 215 220



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 669 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA( genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

ATGT00GAC6 AOGAAOCGAG GCAGAGOGGA GGCTCCTOGC AOOGOOGGGT CGTGACIGIC 
AQOGAOQTCC AGGAOCTGAT OOGGOGCAAG GA06ASATAG AAOOOCAGAT CAAOGOCAAC 
TATGAOGTGC TGGAAAGOCA AAAAGGCATT GGGATGAAOG AGCC3GCTGGT GGACTGTGA6 
GGCTACOOOC OGTCASAOGT GGACCTGTAC CAAGTOCGCA COOOCAOGCA CAACAiEGAXA 
TGCCTGCRGA ATGATCACAA GGCAGTGATG AAGCAGGTGG AGGAGGOGCT GCADCAGCTG 
CAOGCTOGOG ACAAOGASAA GCA0G0CX3GG GACATGGCT6 AGGOQCACAA AGAOGOCAT6 
AGOOGCAAAC TGGGTCAGA6 TGAGAGCXaG GGOOCTOCAC GGGCCTTGGC CAAAGTGAAC 
AGCATCAGOC COGGCTCOOC AGCX3VQCATC GOGGGTCTGC AAGTGGATGA TGAGATTGTG 
GAGTTOGGCT CTGTGAACAC GCA6AACTTC CAGTCACTGC ATAACATTGG CAGTGTGGTG 
CAOCACACTG AOOGGAAGOC OCTGAATGTG ACAGTGATOC GCAQQGOGGA AAAACA0CA6 
CTTAGACTTG TTOCAACAOG CTQGGCAGGA AAAGGACTGC TGGGCTGCAA CATTATTOCT 
CTGCAAAGA 

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1128 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: Human fetal brain cDNA library 

(B) CLONE: GEN-163D09

(ix) FEATURE:

(A) NAME/KEY: CDS 

(B) LOCATION: 125.. 793 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:



ACTGTTCTOG OGTTOGOOGA COOCTGTGGT GTTTTOGOOC AT00G00GA6 GGTAGTTACG 

GTOGACTGOG OOGTOGTOOC TAO0O00OC3A 00030SICIC TGGAGT0006 OOQOOOGGIT 

CACG ATG TOC GAC GAG GAA GOG AOG CAG AOC GGA OGC TOC TCG CAS OOC 
Met Ser Asp Glu Glu Ala Arg Gln Ser Gly Gly Ser Ser Gln Ala
1 5 10 15

GGC GTC GTG ACT GTC AGC GAC GTC CAG GAG CTG ATG OGG OGC AAG GAG 
Gly Val Val Thr Val Ser Asp Val Gln Glu Leu Met Arg Arg Lys Glu
20 25 30

GAG ATA GAA GOG CAG ATC AAG GGC AAC TAT GAC GTG CTG GAA AGC CAA 
Glu Ile Glu Ala Gln Ile Lys Ala Asn Tyr Asp Val Leu Glu Ser Gln
35 40 45 

AAA GGC ATT GGG ATG AAC GAG CCG CTG GTG GAC TGT GAG GGC TAC CCC
Lys Gly Ile Gly Met Asn Glu Pro Leu Val Asp Cys Glu Gly Tyr Pro
50 55 60 

COG TCA GAC GTG GAC CTG TAC CAA GTC OGC AOC GCC AGG CAC AAC ATC 
Arg Ser Asp Val Asp Leu Tyr Gln Val Arg Thr Ala Arg His Asn Ile
65 70 75 

ATA TGC CTG CAG AAT GAT CAC AAG GCA GTG ATG AAG CAG GTG GAG GAG 
Ile Cys Leu Gln Asn Asp His Lys Ala Val Met Lys Gln Val Glu Glu
80 85 90 95 

GOC CTG CAC CAG CTG CAC GCT OGC GAC AAG GAG AAG CAG GOC CX3G GAC 
Ala Leu His Gln Leu His Ala Arg Asp Lys Glu Lys Gln Ala Arg Asp
100 105 110 

ATG GOT GAG GOC CAC AAA GAG GOC ATG AGC OGC AAA CTG GGT CAG AGT 
Met Ala Glu Ala His Lys Glu Ala Met Ser Arg Lys Leu Gly Gln Ser
115 120 125 

GAG AGC CAG GGC OCT OCA OGG GOC TTC GOC AAA GTG AAC AGC ATC AGC 
Glu Ser Gln Gly Pro Pro Arg Ala Phe Ala Lys Val Asn Ser Ile Ser
130 135 140 

OOC GGC TOC OCA GOC AGC ATC GOG GGT CTG CAA GTG GAT GAT GAG ATT 
Pro Gly Ser Pro Ala Ser Ile Ala Gly Leu Gln Val Asp Asp Glu Ile
145 150 155

GTG GAG TTC GGC TCT GTG AAC AOC CAG AAC TTC CAG TCA CTG CAT AAC 649
Val Glu Phe Gly Ser Val Asn Thr Gln Asn Phe Gln Ser Leu His Asn
160 165 170 175

ATT GGC AGT GTG GTG CAG CAC AGT GAG GGG AAG CCC CTG AAT GTG ACA 697
Ile Gly Ser Val Val Gln His Ser Glu Gly Lys Pro Leu Asn Val Thr
180 185 190 

GTG ATC CGC AGG GGG GAA AAA CAC CAG CTT AGA CTT GTT CCA ACA CGC 745
Val Ile Arg Arg Gly Glu Lys His Gln Leu Arg Leu Val Pro Thr Arg
195 200 205 

TGG GCA GGA AAA GGA CTG CTG GGC TGC AAC ATT ATT CCT CTG CAA AGA 793
Trp Ala Gly Lys Gly Leu Leu Gly Cys Asn Ile Ile Pro Leu Gln Arg
210 215 220 

TGATTGTOOC TGQGGAACAG TAACAGGAAA GCATCTTCCC TTGOXTGGA CTTGGGTCTA 853 

GGGATTTOCA ACTTGTCTTC TCTOOCTGAA GCATAAQGAT CIGGAAGAG6 CTTGTAACCT 913 

GAACTTCTGT GTQGTGGCAG TACTGTGGOC CAOCAGTGTA ATCTOOCTGG ATTAAGGCAT 973 

TCTTAAAAAC TTAGGCTTGG ajKJTTTCAC AAATTAGGOC AOGGCOCTAA ATAGGAATTC 1033 

CXJEGGATTGT GGGCAAGTGG GOGGAAGTTA TTCTGGCAGG TACT0GTGT6 ATTATTATTA 1093 

TTATTTTTAA TT^T^AGAGTTT TACAGTGCT6 ATATG 1128 



(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 506 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

Met Ala Glu Ala Asp Phe Lys Met Val Ser Glu Pro Val Ala His Gly
1 5 10 15

Val Ala Glu Glu Glu Met Ala Ser Ser Thr Ser Asp Ser Gly Glu Glu
20 25 30

Ser Asp Ser Ser Ser Ser Ser Ser Ser Thr Ser Asp Ser Ser Ser Ser
35 40 45

Ser Ser Thr Ser Gly Ser Ser Ser Gly Ser Gly Ser Ser Ser Ser Ser
50 55 60

Ser Gly Ser Thr Ser Ser Arg Ser Arg Leu Tyr Arg Lys Lys Arg Val
65 70 75 80

Pro Glu Pro Ser Arg Arg Ala Arg Arg Ala Pro Leu Gly Thr Asn Phe
85 90 95

Val Asp Arg Leu Pro Gln Ala Val Arg Asn Arg Val Gln Ala Leu Arg
100 105 110

Asn Ile Gln Asp Glu Cys Asp Lys Val Asp Thr Leu Phe Leu Lys Ala
115 120 125

Ile His Asp Leu Glu Arg Lys Tyr Ala Glu Leu Asn Lys Pro Leu Tyr
130 135 140

Asp Arg Arg Phe Gln Ile Ile Asn Ala Glu Tyr Glu Pro Thr Glu Glu
145 150 155 160

Glu Cys Glu Trp Asn Ser Glu Asp Glu Glu Phe Ser Ser Asp Glu Glu
165 170 175

Val Gln Asp Asn Thr Pro Ser Glu Met Pro Pro Leu Glu Gly Glu Glu
180 185 190

Glu Glu Asn Pro Lys Glu Asn Pro Glu Val Lys Ala Glu Glu Lys Glu
195 200 205

Val Pro Lys Glu Ile Pro Glu Val Lys Asp Glu Glu Lys Glu Val Ala
210 215 220

Lys Glu Ile Pro Glu Val Lys Ala Glu Glu Lys Ala Asp Ser Lys Asp
225 230 235 240

Cys Met Glu Ala Thr Pro Glu Val Lys Glu Asp Pro Lys Glu Val Pro
245 250 255

Gln Val Lys Ala Asp Asp Lys Glu Gln Pro Lys Ala Thr Glu Ala Lys
260 265 270

Ala Arg Ala Ala Val Arg Glu Thr His Lys Arg Val Pro Glu Glu Arg
275 280 285

Leu Arg Asp Ser Val Asp Leu Lys Arg Ala Arg Lys Gly Lys Pro Lys
290 295 300

Arg Glu Asp Pro Lys Gly Ile Pro Asp Tyr Trp Leu Ile Val Leu Lys
305 310 315 320

Asn Val Asp Lys Leu Gly Pro Met Ile Gln Lys Tyr Asp Glu Pro Ile
325 330 335

Leu Lys Phe Leu Ser Asp Val Ser Leu Lys Phe Ser Lys Pro Gly Gln
340 345 350

Pro Val Ser Tyr Thr Phe Glu Phe His Phe Leu Pro Asn Pro Tyr Phe
355 360 365

Arg Asn Glu Val Leu Val Lys Thr Tyr Ile Ile Lys Ala Lys Pro Asp
370 375 380

His Asn Asp Pro Phe Phe Ser Trp Gly Trp Glu Ile Glu Asp Cys Lys
385 390 395 400

Gly Cys Lys Ile Asp Arg Arg Arg Gly Lys Asp Val Thr Val Thr Thr
405 410 415

Thr Gln Ser Arg Thr Thr Ala Thr Gly Glu Ile Glu Ile Gln Pro Arg
420 425 430

Val Val Pro Asn Ala Ser Phe Phe Asn Phe Phe Ser Pro Pro Glu Ile
435 440 445

Pro Met Ile Gly Lys Leu Glu Pro Arg Glu Asp Ala Ile Leu Asp Glu
450 455 460

Asp Phe Glu Ile Gly Gln Ile Leu His Asp Asn Val Ile Leu Lys Ser
465 470 475 480

Ile Tyr Tyr Tyr Thr Gly Glu Val Asn Gly Thr Tyr Tyr Gln Phe Gly
485 490 495

Lys His Tyr Gly Asn Lys Lys Tyr Arg Lys
500 505



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1518 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA( genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

ATGGCAGAAG CAGATTTTAA AATGGTCTCG GAACCTGTCG CCCATGGGGT TGCCGAAGAG 60

GAGAJTCGCTA GCTOGACTAG TGATTCTGGG GAAGAATCTG ACAGCAGfTAG CTCTAGCAGC 120 

AGCACTAGTG ACAGCAGCAG CAGCAGCAGC ACTAG?rGGCA GCAGCAGCXG CAGOGGCAGC 180 

AGCAGCAGCA GCAGOGGCAG CACTAGCAGC CCSCAGCXXSCT TGTATAGAAA GAAGAGGCTA 240 

OCTGAGCXJrT C3CAGAAGGGC GOGGOGGGOC OOGTTGQGAA CAAATTTOGT GGATAGGCTC 300 

OCTCA00CA6 TTAGAAATOG TGTGCAAGOG CTTAGAAACA TTCAAGATGA ATGTGACAAS 360 

GTAGATAOCC TGTTCTTAAA AGCAATTCAT GATCTTGAAA GAAAATATGC TGAACTCAAC 420 

AAGCX2rC?rGT ATGATAGGOG GTTTCAAATC ATCAATGCAG AATAOGAGCX: TACAGAAGAA 480 

GAATGTGAAT GGAATTCAGA GGATGAGGAG TTCAQCAGT6 ATGAOGAGGT GCAGGATAAC 540 

ACXXX:TAGT6 AAATGCCTCX: CTTAGAOOGT GAOGAAGAAG AAAACXX:TAA AGAAAAODCA 600 

GAGGTGAAAG CTGAAGAGAA GGAAGTTOCT AAAGAAATTC CTGAGGTGAA GGATGAAGAA 660 

AAOGAAGTTG CTAAAGAAAT TOCTGAGGTA AAGGCTGAAG AAAAAGCAGA TTCTAAT^GAC 720 

TGTAT0GA06 CAAO^OCTGA AGTAAAAGAA GATOCTAAAG AAGTCXXXXA GGTAAAOOCA 780 

GATGATAAAG AACAGOCTAA AGCAACAGAG GCTAAGGCAA GGGCTCCAGT AAGAGAGACT 840 

CATAAAAGAG TTOCTGAGGA AAGQCTTOGG GACAGTGTAG ATCTTAAAAG AGCTAGGAAG 900 

GGAAAGC3CTA AAAGAGAAGA CXXITAAAGGC ATTC3CTGACT ATTGGCTGAT •TCTTTTAAAG 960 

AATGTTGACA AGCTOGGGOC TATGATTCAG AAGTATGAT6 AGOOCATTCT GAAGTTCTTG 1020 

TOGGATGTTA GOCTGAAGTT CTCAAAACCT GGOCAGOCTG TAAGTTACAC ClTTGAATTT 1080 

C»TTTTCTAC CX^ACXXMA CTTCAGAAAT GAGGTGCTGG TGAA6ACATA TATAATAAAG 1140 

GCAAAAOCAG ATCACAATGA TCXXTTTCTTT TCTTGGGGAT GGGAAATTGA AGATTGCAAA 1200 

OGCTOCAAGA TAGAOCX9GAG AAGAGGAAAA GATGTTACTG TGACAACTAC OCAGAGTCX3C 1260 

ACAACTGCTA CTGGAGAAAT TGAAATCXIAG OCAAGAGTGG TTOCTAATGC ATCATTCTTC 1320 

AACTTCTTTA GTOCTOCTGA GATTOCTATG ATTGGGAAGC TGGAAOCAOG AGAAGATGCT 1380 

ATOCTGGATG AGGACTTTGA AATTGGGCAG ATTTTACATG ATAATGTCAT CXnxSAAATCA 1440 

ATCTATTACT ATACTGGAGA AGTCAATGGT ACCTACTATC AATTTGGCAA ACATTATGGA 1500

AACAAGAAAT ACAGAAAA 1518

(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2636 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA( genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: Human fetal brain cDNA library

(B) CLONE: GEN-078D05 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 266.. 1783 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

GATTC0GCT6 OGGTACATCT GGOCACTCTA OCTOCAOOOG OGAGAOGOCT TGOOGOCACC 60 

GCTGTCGCOC AAGGCTCCAC TGCOOCTOOC ACCTCAGOGC CGOOCTCTOC ATCOGCAOCT 120 

CCAOCTCOGC TCTOOGOOGC TOCTGCCATC OOOGCTGOCA OCTOOOCAOC COGG G OCTOC 180 

GOOOOOOOCA OOCAAGCATC OGTGAGTCAT TTTCTGCOCA TCTCTGGTOG OGOGGTCTOC 240 

CTGGTAGAGT TTGTAGGCTT GCAAG ATG GCA GAA GCA GAT TTT AAA ATG GTC 292 

Met Ala Glu Ala Asp Phe Lys Met Val 
1 5 

TOG GAA OCT GTC GCC CAT GGG GTT GCC GAA GAG GAG ATG OCT AGO TOG 340 
Ser Glu Pro Val Ala His Gly Val Ala Glu Glu Glu Met Ala Ser Ser 
10 15 20 25 

ACT AGT GAT TOT GGG GAA GAA TCT GAG AGC AGT AGO TOT AGC AGO AGC 388 
Thr Ser Asp Ser Gly Glu Glu Ser Asp Ser Ser Ser Ser Ser Ser Ser 

30 35 40 

ACT AGT GAC AGC AGC AGC AGC AGC AGC ACT AGT OOC AGC AGC AGC OGC 436 
Thr Ser Asp Ser Ser Ser Ser Ser Ser Thr Ser Gly Ser Ser Ser Gly 
45 50 55

AGC GGC AGC AGC AGC AGC AGC AGC GGC AGC ACT AGC AGC CGC AGC CGC 484
Ser Gly Ser Ser Ser Ser Ser Ser Gly Ser Thr Ser Ser Arg Ser Arg 
60 65 70 

TTG TAT AGA AAG AAG AGG GTA OCT GAG OCT TOO AGA AGG GOG OGG OGG 532 
Leu Tyr Arg Lys Lys Arg Val Pro Glu Pro Ser Arg Arg Ala Arg Arg
75 80 85 

GOC OOG TTG GGA ACA AAT TTC GTG GAT AGG CTG OCT GAG GCA GTT AGA 580 
Ala Pro Leu Gly Thr Asn Phe Val Asp Arg Leu Pro Gln Ala Val Arg
90 95 100 105

AAT OGT GTG CAA GOG CTT AGA AAC ATT CAA GAT GAA TGT GAC AAG GTA 628 
Asn Arg Val Gln Ala Leu Arg Asn Ile Gln Asp Glu Cys Asp Lys Val
110 115 120 

GAT ACC CTG TTC TTA AAA GCA ATT CAT GAT CTT GAA AGA AAA TAT GCT 676
Asp Thr Leu Phe Leu Lys Ala Ile His Asp Leu Glu Arg Lys Tyr Ala
125 130 135 

GAA CTC AAC AAG CCT CTG TAT GAT AGG CGG TTT CAA ATC ATC AAT GCA 724
Glu Leu Asn Lys Pro Leu Tyr Asp Arg Arg Phe Gln Ile Ile Asn Ala
140 145 150 

GAA TAG GAG OCT ACA GAA GAA GAA TGT GAA TGG AAT TCA GAG GAT GAG 772 
Glu Tyr Glu Pro Thr Glu Glu Glu Cys Glu Trp Asn Ser Glu Asp Glu 
155 160 165 

GAG TTC AGC AGT GAT GAG GAG GTG CAG GAT AAC ACC CCT AGT GAA ATG 820
Glu Phe Ser Ser Asp Glu Glu Val Gln Asp Asn Thr Pro Ser Glu Met
170 175 180 185

CCT CCC TTA GAG GGT GAG GAA GAA GAA AAC CCT AAA GAA AAC CCA GAG 868
Pro Pro Leu Glu Gly Glu Glu Glu Glu Asn Pro Lys Glu Asn Pro Glu 
190 195 200 

GTG AAA GOT GAA GAG AAG GAA GTT OCT AAA GAA ATT OCT GAG GTG AAG 916 
Val Lys Ala Glu Glu Lys Glu Val Pro Lys Glu Ile Pro Glu Val Lys
205 210 215 

GAT GAA GAA AAG GAA GTT GCT AAA GAA ATT OCT GAG GTA AAG GCT GAA 964 
Asp Glu Glu Lys Glu Val Ala Lys Glu Ile Pro Glu Val Lys Ala Glu
220 225 230 

GAA AAA GCA GAT TOT AAA GAC TGT ATG GAG GCA AOC OCT GAA GTA AAA 1012 
Glu Lys Ala Asp Ser Lys Asp Cys Met Glu Ala Thr Pro Glu Val Lys
235 240 245 

GAA GAT OCT AAA GAA GTC COC CAG GTA AAG GCA GAT GAT AAA GAA CAG 1060 
Glu Asp Pro Lys Glu Val Pro Gln Val Lys Ala Asp Asp Lys Glu Gln
250 255 260 265

OCT AAA GCA ACA GAG OCT AAG GCA AGG GCT GCA GTA AGA GAG ACT CAT 1108 
Pro Lys Ala Thr Glu Ala Lys Ala Arg Ala Ala Val Arg Glu Thr His
270 275 280 

AAA AGA GTT CCT GAG GAA AGG CTT OGG GAC AGT GTA GAT CTT AAA AGA 1156 
Lys Arg Val Pro Glu Glu Arg Leu Arg Asp Ser Val Asp Leu Lys Arg
285 290 295 

GCT AGG AAG GGA AAG OCT AAA AGA GAA GAC OCT AAA GGC ATT OCT GAC 1204 
Ala Arg Lys Gly Lys Pro Lys Arg Glu Asp Pro Lys Gly Ile Pro Asp
300 305 310 

TAT TGG CTG ATT GTT TTA AAG AAT GTT GAC AAG CTC GGG OCT ATG ATT 1252 
Tyr Trp Leu Ile Val Leu Lys Asn Val Asp Lys Leu Gly Pro Met Ile
315 320 325 

GAG AAG TAT GAT GAG OOC ATT CTG AAG TTC TTG TOG GAT GTT AQC CTG 1300 
Gln Lys Tyr Asp Glu Pro Ile Leu Lys Phe Leu Ser Asp Val Ser Leu
330 335 340 345 

AAG TTC TCA AAA OCT GGC CAG OCT GTA AGT TAG AGO TTT GAA TTT CAT 1348 
Lys Phe Ser Lys Pro Gly Gln Pro Val Ser Tyr Thr Phe Glu Phe His
350 355 360 

TTT CTA CCC AAC CCA TAC TTC AGA AAT GAG GTG CTG GTG AAG ACA TAT 1396
Phe Leu Pro Asn Pro Tyr Phe Arg Asn Glu Val Leu Val Lys Thr Tyr
365 370 375

ATA ATA AAG GCA AAA CCA GAT CAC AAT GAT CCC TTC TTT TCT TGG GGA 1444
Ile Ile Lys Ala Lys Pro Asp His Asn Asp Pro Phe Phe Ser Trp Gly
380 385 390 

TGG GAA ATT GAA GAT TGC AAA GGC TGC AAG ATA GAC CGG AGA AGA GGA 1492
Trp Glu Ile Glu Asp Cys Lys Gly Cys Lys Ile Asp Arg Arg Arg Gly
395 400 405 

AAA GAT GTT ACT GTG ACA ACT ACC CAG AGT OGG ACA ACT GCT ACT GGA 1540
Lys Asp Val Thr Val Thr Thr Thr Gln Ser Arg Thr Thr Ala Thr Gly
410 415 420 425 

GAA ATT GAA ATC CAG OCA AGA GTG GTT OCT AAT GCA TCA TTC TTC AAG 1588 
Glu Ile Glu Ile Gln Pro Arg Val Val Pro Asn Ala Ser Phe Phe Asn
430 435 440 

TTC TTT AGT OCT OCT GAG ATT OCT ATG ATT GGG AAG CTG GAA OCA OGA 1636 
Phe Phe Ser Pro Pro Glu Ile Pro Met Ile Gly Lys Leu Glu Pro Arg
445 450 455

GAA GAT GCT ATC CTG GAT GAG GAC TTT GAA ATT GGG CAG ATT TTA CAT 1684
Glu Asp Ala Ile Leu Asp Glu Asp Phe Glu Ile Gly Gln Ile Leu His
460 465 470



GAT AAT GTC ATC CTG AAA TCA ATC TAT TAC TAT ACT GGA GAA GTC AAT 1732
Asp Asn Val Ile Leu Lys Ser Ile Tyr Tyr Tyr Thr Gly Glu Val Asn
475 480 485



GGT ACC TAC TAT CAA TTT GGC AAA CAT TAT GGA AAC AAG AAA TAC AGA 1780
Gly Thr Tyr Tyr Gln Phe Gly Lys His Tyr Gly Asn Lys Lys Tyr Arg
490 495 500 505



AAA TAAGTCAATC TGAAAGATTT TTCAAGAATC TTAAAATCTC AAGAAGTGAA 1833
Lys



GCAGATTCAT ACAGOCTTGA AAAAAGTAAA AOCCTGAOCT GTAAOCTGAA CACTATTATT 1893 

OCTTATAGTC AAGTTTTTGT GGTTTCTTGG TAGTCTATAT TTTAAAAATA GTOCTAAAAA 1953 

GTGTCTAAGT GOCAGTTTAT TCTATCTAGG CTGTTGTAGT ATAATATTCT TCAAAATATG 2013 

TAAGCTGTTG TCAATTATCT AAAGCATGTT AGTTTGGTGC TACACAGTGT TGATTTTTGT 2073 

GATGTCCTTT GG?IX»TGTTT CTGTTAGACT GTAGCTGTGA AACTGTCAGA ATTGTTAACT 2133 

GAT^ACAAATA TTTGCTTGAA AAAATW^AGTT CATGAAGTAC CAATGCAAGT GTTTTATTTT 2193 

Tmcr'iTrr tocagoocat aagactaagg gtttaaatct gcttgcacta gctgtgoctt 2253 

CATTAGTTTG CTATAGAAAT OCAGTACTTA TAGTAAATAA AACAGTGTAT TTTGAAGTTT 2313 

GACTGCTTGA T^AAAGATTAG CATACATCTA ATGTGAAAAG AOCACATTTG ATTCAACTGA 2373 

GAOCTTGTGT ATGTGACATA TAGTGGOCTA TAAATTTAAT CATAATGATG TTATTGTTTA 2433 

CCACTGAOGT GTTAATATAA CATAGTATTT TTGAAAAAGT TTCTTCATCT TATATTGTGT 2493 

AATTGTAAAC TAAAGATAOC GTGTTTTCTT TGTATTGTGT TCTAOCTTOC CTTTCACTGA 2553 

AAATGATCAC TTCATTTGAT ACTGTTTTTC ATGTTCTTGT ATTGCAAOCT AAAATAAATA 2613 

AATATTAAAG TGTGTTATAC TAT 2636 
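
The codon-by-codon annotation used in these listings can be spot-checked mechanically. The following is a minimal sketch in Python, assuming the standard genetic code; the helper name `translate` and the table construction are illustrative and not part of the original disclosure. It translates the first nine codons of the coding region of SEQ ID NO: 21 as printed above and prints them in the three-letter notation of the listing.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# One-letter to three-letter residue names, as used in the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate(dna):
    """Translate a DNA string codon by codon, stopping at the first stop codon."""
    dna = dna.replace(" ", "").upper()
    residues = []
    for pos in range(0, len(dna) - 2, 3):
        amino = CODON_TABLE[dna[pos:pos + 3]]
        if amino == "*":  # stop codon ends the coding region
            break
        residues.append(THREE_LETTER[amino])
    return residues


# First nine codons of the CDS of SEQ ID NO: 21 (positions 266 to 292).
print(" ".join(translate("ATG GCA GAA GCA GAT TTT AAA ATG GVC".replace("V", "T"))))
```

Running the same function over a full coding region (after OCR repair of the nucleotide line) gives a quick way to compare the translated residues against the annotated amino-acid line beneath each codon row.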

(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 170 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

Met Thr Glu Leu Gln Ser Ala Leu Leu Leu Arg Arg Gln Leu Ala Glu
1 5 10 15

Leu Asn Lys Asn Pro Val Glu Gly Phe Ser Ala Gly Leu Ile Asp Asp
20 25 30

Asn Asp Leu Tyr Arg Trp Glu Val Leu Ile Ile Gly Pro Pro Asp Thr
35 40 45

Leu Tyr Glu Gly Gly Val Phe Lys Ala His Leu Thr Phe Pro Lys Asp
50 55 60

Tyr Pro Leu Arg Pro Pro Lys Met Lys Phe Ile Thr Glu Ile Trp His
65 70 75 80

Pro Asn Val Asp Lys Asn Gly Asp Val Cys Ile Ser Ile Leu His Glu
85 90 95

Pro Gly Glu Asp Lys Tyr Gly Tyr Glu Lys Pro Glu Glu Arg Trp Leu
100 105 110

Pro Ile His Thr Val Glu Thr Ile Met Ile Ser Val Ile Ser Met Leu
115 120 125

Ala Asp Pro Asn Gly Asp Ser Pro Ala Asn Val Asp Ala Ala Lys Glu
130 135 140

Trp Arg Glu Asp Arg Asn Gly Glu Phe Lys Arg Lys Val Ala Arg Cys
145 150 155 160

Val Arg Lys Ser Gln Glu Thr Ala Phe Glu
165 170



(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 510 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA( genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

ATGACGGAGC TGCAGTCGGC ACTGCTACTG CGAAGACAGC TGGCAGAACT CAACAAAAAT 60

OCAGIGGAAG GCTTTTCTGC AGGTTTAATA GATGACAATG ATCTCTACXX3 ATGGGAAGTC 120 

CTTATTATTG ODOCTOCAGA TACACTTTAT GAAGGrTGGTG TTTTTAAGGC TCATCTTACT 180 

TTG0CAAAA8 ATTATCXXCT COSAOCTOCT AAAATGAAAT TCATTACAGA AATCTGGCAC 240 

CXIAAATGTTG ATAAAAAT06 TGATGTGTGC ATTTCTATTC TTCATGAGOC TGGGGAAGAT 300 

AAGTATOGTT ATGAAAAOOC AGAGGAAOGC TGGCTOOCTA TOCACACTGT GGAAAOCATC 360 

ATGATTAGTG TCATTTCTAT GCTGGCAGAC OCTAATGGAG ACTCACCTGC TAATGTTGAT 420 

0CT0CX3AAA6 AATGGAOGGA AGATAGAAAT OGAQAATTTA AAAGAAAAGT TGOGOGCTGT 480 

GTAAGAAAAA OOCAAQAGAC TGCTTTTGAG 510 



(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 617 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA( genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: Human fetal brain cDNA library 

(B) CLONE: GEN-423A12 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 19.. 528 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 



GGGOOCTCGG CAQGGAQG ATG AOG GAG CTG CAG TOG GCA CTG CTA CTG OGA 51 

Met Thr Glu Leu Gln Ser Ala Leu Leu Leu Arg
1 5 10

AGA CAG CTG GCA GAA CTC AAC AAA AAT CCA GTG GAA GGC TTT TCT GCA 99
Arg Gln Leu Ala Glu Leu Asn Lys Asn Pro Val Glu Gly Phe Ser Ala
15 20 25

GGT TTA ATA GAT GAC AAT GAT CTC TAC CGA TGG GAA GTC CTT ATT ATT 147
Gly Leu Ile Asp Asp Asn Asp Leu Tyr Arg Trp Glu Val Leu Ile Ile
30 35 40 

GGC CCT CCA GAT ACA CTT TAT GAA GGT GGT GTT TTT AAG GCT CAT CTT 195
Gly Pro Pro Asp Thr Leu Tyr Glu Gly Gly Val Phe Lys Ala His Leu 
45 50 55 

ACT TTC CCA AAA GAT TAT CCC CTC CGA CCT CCT AAA ATG AAA TTC ATT 243 
Thr Phe Pro Lys Asp Tyr Pro Leu Arg Pro Pro Lys Met Lys Phe Ile
60 65 70 75 

ACA GAA ATC TGG CAC CCA AAT GTT GAT AAA AAT GGT GAT GTG TGC ATT 291 
Thr Glu Ile Trp His Pro Asn Val Asp Lys Asn Gly Asp Val Cys Ile
80 85 90

TCT ATT CTT CAT GAG CCT GGG GAA GAT AAG TAT GGT TAT GAA AAG CCA 339
Ser Ile Leu His Glu Pro Gly Glu Asp Lys Tyr Gly Tyr Glu Lys Pro
95 100 105 

GAG GAA CGC TGG CTC OCT ATC CAC ACT GTG GAA AOC ATC ATG ATT AGT 387 
Glu Glu Arg Trp Leu Pro Ile His Thr Val Glu Thr Ile Met Ile Ser
110 115 120 

GTC ATT TCT ATG CTG GCA GAC CCT AAT GGA GAC TCA OCT GCT AAT GTT 435 
Val Ile Ser Met Leu Ala Asp Pro Asn Gly Asp Ser Pro Ala Asn Val
125 130 135 

GAT GCT GCG AAA GAA TOG AOG GAA GAT AGA AAT GGA GAA TTT AAA AGA 483 
Asp Ala Ala Lys Glu Trp Arg Glu Asp Arg Asn Gly Glu Phe Lys Arg 
140 145 150 155 

AAA GTT GOC CGC TGT GTA AGA AAA AGC CAA GAG ACT GCT TTT GAG 528 
Lys Val Ala Arg Cys Val Arg Lys Ser Gln Glu Thr Ala Phe Glu
160 165 170 

TGACATTTAT TTAGCAGCTA GTAACTTCAC TTATTTCAGG GTCTCCAATT GAGAAACATG 588 

GCACTGTTTT TCCTGCACTC TAOOCACOG 617 



(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 374 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

Met Val Leu Trp Glu Ser Pro Arg Gln Cys Ser Ser Trp Thr Leu Cys
1 5 10 15

Glu Gly Phe Cys Trp Leu Leu Leu Leu Pro Val Met Leu Leu Ile Val
20 25 30

Ala Arg Pro Val Lys Leu Ala Ala Phe Pro Thr Ser Leu Ser Asp Cys
35 40 45

Gln Thr Pro Thr Gly Trp Asn Cys Ser Gly Tyr Asp Asp Arg Glu Asn
50 55 60

Asp Leu Phe Leu Cys Asp Thr Asn Thr Cys Lys Phe Asp Gly Glu Cys
65 70 75 80

Leu Arg Ile Gly Asp Thr Val Thr Cys Val Cys Gln Phe Lys Cys Asn
85 90 95

Asn Asp Tyr Val Pro Val Cys Gly Ser Asn Gly Glu Ser Tyr Gln Asn
100 105 110

Glu Cys Tyr Leu Arg Gln Ala Ala Cys Lys Gln Gln Ser Glu Ile Leu
115 120 125

Val Val Ser Glu Gly Ser Cys Ala Thr Asp Ala Gly Ser Gly Ser Gly
130 135 140

Asp Gly Val His Glu Gly Ser Gly Glu Thr Ser Gln Lys Glu Thr Ser
145 150 155 160

Thr Cys Asp Ile Cys Gln Phe Gly Ala Glu Cys Asp Glu Asp Ala Glu
165 170 175

Asp Val Trp Cys Val Cys Asn Ile Asp Cys Ser Gln Thr Asn Phe Asn
180 185 190

Pro Leu Cys Ala Ser Asp Gly Lys Ser Tyr Asp Asn Ala Cys Gln Ile
195 200 205

Lys Glu Ala Ser Cys Gln Lys Gln Glu Lys Ile Glu Val Met Ser Leu
210 215 220

Gly Arg Cys Gln Asp Asn Thr Thr Thr Thr Thr Lys Ser Glu Asp Gly
225 230 235 240

His Tyr Ala Arg Thr Asp Tyr Ala Glu Asn Ala Asn Lys Leu Glu Glu
245 250 255

Ser Ala Arg Glu His His Ile Pro Cys Pro Glu His Tyr Asn Gly Phe
260 265 270

Cys Met His Gly Lys Cys Glu His Ser Ile Asn Met Gln Glu Pro Ser
275 280 285

Cys Arg Cys Asp Ala Gly Tyr Thr Gly Gln His Cys Glu Lys Lys Asp
290 295 300

Tyr Ser Val Leu Tyr Val Val Pro Gly Pro Val Arg Phe Gln Tyr Val
305 310 315 320

Leu Ile Ala Ala Val Ile Gly Thr Ile Gln Ile Ala Val Ile Cys Val
325 330 335

Val Val Leu Cys Ile Thr Arg Lys Cys Pro Arg Ser Asn Arg Ile His
340 345 350

Arg Gln Lys Gln Asn Thr Gly His Tyr Ser Ser Asp Asn Thr Thr Arg
355 360 365

Ala Ser Thr Arg Leu Ile
370



(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1122 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA( genomic ) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 



ATGGTGCTGT GGGAGTOCOC GCGGCAGTGC AGCAGCTGGA CACTTTGOGA GGQCTTTTGC 60 

TGGCTGCTGC TGCTGOCOGT CATGCTACTC ATOGTAGOOC GCOOGGTGAA GCTCGCTGCT 120 

TTCOCTAOCT CCTTAAGTGA CTOOCAAAOG COCAOOGGCT GGAATTGCTC TGGTTATGAT 180 

GACAGAGAAA ATGATCTCTT OCTCTGIGAC ACCAACAOCT GTAAATTT6A TOOOGAATGT 240 

TTAAGAATTG GAGACACTGT GACTTGCGTC TGTCAGTTCA AGTGCAACAA TGACTATGTG 300

CCTGTGTGTG GCTCCAATGG GGAGAGCTAC CAGAATGAGT GTTACCTGCG ACAGGCTGCA 360

TOCAAACAOC AGAGTGAGAT ACTTGTOGTG TCAGAAOGAT CATGrOCXIAC AGATGCAOGA 420 

TCAGGATCTG GAGATOGAGT OCATGAAGGC TCTGGAGAAA CTAGTCAAAA GGAGACATOC 480 

ACCTGTGATA TTTGCX»GTT TGGTGCAGAA TGTGAOGAAG ATGOOGAGGA TGTCTGGTGT 540 

GTGTGTAATA TTGACTGTTC TCAAAOCAAC TTCAATOOCX: TCTGOGCTTC TGATGGGAAA 600 

TCTTATGATA ATGCATGOCA AATCAAAGAA OCATOGTGTC AGAAACAOGA GAAAATTGAA 660 

GTCATGTCTT TGGGTOGATG TCAAGATAAC ACAACTACAA CTACTAAGTC TGAAGATGGG 720 

CATTATOCAA GAACAGATTA TGCAGAGAAT GCTAACAAAT TAGAAGAAAG TOOCAGAGAA 780 

CADCACATAC CTTGTOOGGA ACATTACAAT GGCTTCTQCA TQCATGGGAA GTGTGAGCAT 840 

TCTATCAATA TQCSVOGAGCX: ATCTTOCAOG TGTGATGCTG GTTATACIGG ACAACACTGT 900 

GAAAAAAAGG ACTACAGTGT TCTATAOGTT GTKXXXSCSTC CTGTAOGATT TCAGTATGTC 960 

TTAATOSCAG CrGTGATTGG AACAATTCAG ATTGCTGTCA TCTGTGTGGT GGTOCTCTGC 1020 

ATCACAAOGA MOGOCCCMS AAGCAACAGA ATTCACAGAC AGAAGCAAAA TACAOGGCAC 1080 

TACAGTTCA6 ACAATACAAC AAGAOOGTOC ACGAGGTTAA TC 1122 

(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1721 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA( genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO 

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: Human fetal brain cDNA library 

(B) CLONE: GEN-092E10 



(ix) FEATURE:

(A) NAME/KEY: CDS 

(B) LOCATION: 368.. 1489

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

CTGaSGQGCG CCTTGACTCT OOCTCCAOCC TGCCTCCTOG GGCTOCACTC GTCTGOCXXTT 60 

GGACTOOOGT CTOCrrCCrGT OCTOGQGCTT OCCAGAGCTC CCTOCTTATG GCAGCAGCTT 120 

GCOGOGTCTC CGGOGCAQCT TCTCAGGQGA OGAOOCTCTC GCTOOGGGGC TGAGCCAGTC 180 

OCTGGATGTT GCTGAAACTC TOGAGATCAT GOGOGGGTTT GGCTGCTGCT TOOOOGOOGG 240 

GTQCCACTGC CAOOGOCGCC GOCTCTGCTG OOQOOGTOOG OGGGATGCTC AGTAGOOCGC 300 

TGOOOGGOOC COGOGATOCT GTGTTOCIOG GAAGCOGTTT GCTGCTOCAG AGTTGCAOGA 360 

ACTAGTC ATG GTG CTG TGG GAG TCC OCX5 CGG CAG TGC AGC AGC TGG ACA 409 
Met Val Leu Trp Glu Ser Pro Arg Gln Cys Ser Ser Trp Thr
1 5 10

CTT TGC GAG GGC TTT TGC TGG CTG CTG CTG CTG CCC GTC ATG CTA CTC 457 
Leu Cys Glu Gly Phe Cys Trp Leu Leu Leu Leu Pro Val Met Leu Leu
15 20 25 30 

ATC GTA GCC CGC COG GTG AAG CTC GCT GCT TTC CCT ACC TCC TTA AGT 505 
Ile Val Ala Arg Pro Val Lys Leu Ala Ala Phe Pro Thr Ser Leu Ser
35 40 45

GAC TGC CAA AOG CCC ACC GGC TGG AAT TGC TCT GGT TAT GAT GAC AGA 553 
Asp Cys Gln Thr Pro Thr Gly Trp Asn Cys Ser Gly Tyr Asp Asp Arg
50 55 60 

GAA AAT GAT CTC TTC CTC TGT GAC AOC AAC AOC TGT AAA TTT GAT GGG 601 
Glu Asn Asp Leu Phe Leu Cys Asp Thr Asn Thr Cys Lys Phe Asp Gly
65 70 75 

GAA TGT TTA AGA ATT GGA GAC ACT GTG ACT TGC GTC TGT CAG TTC AAG 649 
Glu Cys Leu Arg Ile Gly Asp Thr Val Thr Cys Val Cys Gln Phe Lys
80 85 90 

TGC AAC AAT GAC TAT GTG CCT GTG TGT GGC TCC AAT GGG GAG AGC TAC 697 
Cys Asn Asn Asp Tyr Val Pro Val Cys Gly Ser Asn Gly Glu Ser Tyr 
95 100 105 110 

CAG AAT GAG TGT TAC CTG CGA CAG GCT GCA TGC AAA CAG CAG AGT GAG 745 
Gln Asn Glu Cys Tyr Leu Arg Gln Ala Ala Cys Lys Gln Gln Ser Glu
115 120 125 



ATA CTT GTC GTG TCA GAA GGA TCA TGT GCC ACA GAT GCA GGA TCA GGA 793
Ile Leu Val Val Ser Glu Gly Ser Cys Ala Thr Asp Ala Gly Ser Gly
130 135 140

TCT GGA GAT GGA GTC CAT GAA GGC TCT GGA GAA ACT AGT CAA AAG GAG 841
Ser Gly Asp Gly Val His Glu Gly Ser Gly Glu Thr Ser Gln Lys Glu
145 150 155 

ACA TCC ACC TGT GAT ATT TGC CAG TTT GGT GGA GAA TGT GAC GAA GAT 889 
Thr Ser Thr Cys Asp Ile Cys Gln Phe Gly Ala Glu Cys Asp Glu Asp
160 165 170 

GOC GAG GAT GTC TGG TGT GTG TGT AAT ATT GAC TGT TCT CAA ACC AAC 937 
Ala Glu Asp Val Trp Cys Val Cys Asn Ile Asp Cys Ser Gln Thr Asn
175 180 185 190

TTC AAT COC CTC TGC OCT TCT GAT GGG AAA TCT TAT GAT AAT GCA TGC 985 
Phe Asn Pro Leu Cys Ala Ser Asp Gly Lys Ser Tyr Asp Asn Ala Cys
195 200 205 

CAA ATC AAA GAA GCA TGG TGT CAG AAA CAG GAG AAA ATT GAA GTC ATC 1033 
Gln Ile Lys Glu Ala Ser Cys Gln Lys Gln Glu Lys Ile Glu Val Met
210 215 220 

TCT TTG GGT CGA TGT CAA GAT AAC ACA ACT ACA ACT ACT AAG TCT GAA 1081
Ser Leu Gly Arg Cys Gln Asp Asn Thr Thr Thr Thr Thr Lys Ser Glu
225 230 235 

GAT GGG CAT TAT GCA AGA ACA GAT TAT GCA GAG AAT GCT AAC AAA TTA 1129
Asp Gly His Tyr Ala Arg Thr Asp Tyr Ala Glu Asn Ala Asn Lys Leu 
240 245 250 

GAA GAA ACT GCC AGA GAA CAC CAC ATA CCT TGT CCG GAA CAT TAC AAT 1177
Glu Glu Ser Ala Arg Glu His His Ile Pro Cys Pro Glu His Tyr Asn
255 260 265 270

GGC TTC TGC ATG CAT GGG AAG TGT GAG CAT TCT ATC AAT ATG CAG GAG 1225 
Gly Phe Cys Met His Gly Lys Cys Glu His Ser Ile Asn Met Gln Glu
275 280 285

CCA TCT TGC AGG TGT GAT GCT GGT TAT ACT GGA CAA CAC TGT GAA AAA 1273
Pro Ser Cys Arg Cys Asp Ala Gly Tyr Thr Gly Gln His Cys Glu Lys
290 295 300 

AAG GAC TAC AGT GTT CTA TAC GTT GTT CCC GGT CCT GTA CGA TTT CAG 1321
Lys Asp Tyr Ser Val Leu Tyr Val Val Pro Gly Pro Val Arg Phe Gln
305 310 315 

TAT GTC TTA ATC GCA GCT GTG ATT GGA ACA ATT CAG ATT GCT GTC ATC 1369 
Tyr Val Leu Ile Ala Ala Val Ile Gly Thr Ile Gln Ile Ala Val Ile
320 325 330 

TCT GTG GTG GTC CTC TGC ATC ACA AGG AAA TGC CCC AGA AGC AAC AGA 1417
Cys Val Val Val Leu Cys Ile Thr Arg Lys Cys Pro Arg Ser Asn Arg
335 340 345 350

ATT CAC AGA CAG AAG CAA AAT ACA GGG CAC TAC AGT TCA GAC AAT ACA 1465
Ile His Arg Gln Lys Gln Asn Thr Gly His Tyr Ser Ser Asp Asn Thr
355 360 365 

ACA AGA CCG TCG ACG AGG TTA ATC TAA AGGGAGCATG TTTCACAGTG 1512
Thr Arg Ala Ser Thr Arg Leu Ile
370 

GCTGGACTAC CGAGAGCTTG GACTACACAA TACAGTATTA TAGACAAAAG AATAAGACAA 1572

GAGATCTACA CATGTTGCCT TGCATTTGTG GTAATCTACA CCAATGAAAA CATGTACTAC 1632

AGCTATATTT GATTATGTAT GGATATATTT GAAATAGTAT ACATTGTCTT GATGTTTTTT 1692 

CTGTAATGTA AATAAACTAT TTATATCAC 1721 



(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 817 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:



Met Gly Asp Thr Val Val Glu Pro Ala Pro Leu Lys Pro Thr Ser Glu
1 5 10 15 

Pro Thr Ser Gly Pro Pro Gly Asn Asn Gly Gly Ser Leu Leu Ser Val
20 25 30 

Ile Thr Glu Gly Val Gly Glu Leu Ser Val Ile Asp Pro Glu Val Ala
35 40 45 

Gln Lys Ala Cys Gln Glu Val Leu Glu Lys Val Lys Leu Leu His Gly
50 55 60 

Gly Val Ala Val Ser Ser Arg Gly Thr Pro Leu Glu Leu Val Asn Gly 
65 70 75 80 

Asp Gly Val Asp Ser Glu Ile Arg Cys Leu Asp Asp Pro Pro Ala Gln
85 90 95

Ile Arg Glu Glu Glu Asp Glu Met Gly Ala Ala Val Ala Ser Gly Thr
100 105 110

Ala Lys Gly Ala Arg Arg Arg Arg Gln Asn Asn Ser Ala Lys Gln Ser
115 120 125 

Trp Leu Leu Arg Leu Phe Glu Ser Lys Leu Phe Asp Ile Ser Met Ala
130 135 140 

Ile Ser Tyr Leu Tyr Asn Ser Lys Glu Pro Gly Val Gln Ala Tyr Ile
145 150 155 160 

Gly Asn Arg Leu Phe Cys Phe Arg Asn Glu Asp Val Asp Phe Tyr Leu
165 170 175

Pro Gln Leu Leu Asn Met Tyr Ile His Met Asp Glu Asp Val Gly Asp
180 185 190 

Ala Ile Lys Pro Tyr Ile Val His Arg Cys Arg Gln Ser Ile Asn Phe
195 200 205 

Ser Leu Gln Cys Ala Leu Leu Leu Gly Ala Tyr Ser Ser Asp Met His
210 215 220 

Ile Ser Thr Gln Arg His Ser Arg Gly Thr Lys Leu Arg Lys Leu Ile
225 230 235 240 

Leu Ser Asp Glu Leu Lys Pro Ala His Arg Lys Arg Glu Leu Pro Ser 
245 250 255 

Leu Ser Pro Ala Pro Asp Thr Gly Leu Ser Pro Ser Lys Arg Thr His 
260 265 270

Gln Arg Ser Lys Ser Asp Ala Thr Ala Ser Ile Ser Leu Ser Ser Asn
275 280 285 

Leu Lys Arg Thr Ala Ser Asn Pro Lys Val Glu Asn Glu Asp Glu Glu 
290 295 300 

Leu Ser Ser Ser Thr Glu Ser Ile Asp Asn Ser Phe Ser Ser Pro Val
305 310 315 320 

Arg Leu Ala Pro Glu Arg Glu Phe Ile Lys Ser Leu Met Ala Ile Gly
325 330 335 

Lys Arg Leu Ala Thr Leu Pro Thr Lys Glu Gln Lys Thr Gln Arg Leu
340 345 350 

Ile Ser Glu Leu Ser Leu Leu Asn His Lys Leu Pro Ala Arg Val Trp
355 360 365 
Leu Pro Thr Ala Gly Phe Asp His His Val Val Arg Val Pro His Thr
370 375 380 

Gln Ala Val Val Leu Asn Ser Lys Asp Lys Ala Pro Tyr Leu Ile Tyr
385 390 395 400 

Val Glu Val Leu Glu Cys Glu Asn Phe Asp Thr Thr Ser Val Pro Ala
405 410 415 

Arg Ile Pro Glu Asn Arg Ile Arg Ser Thr Arg Ser Val Glu Asn Leu
420 425 430 

Pro Glu Cys Gly Ile Thr His Glu Gln Arg Ala Gly Ser Phe Ser Thr
435 440 445 

Val Pro Asn Tyr Asp Asn Asp Asp Glu Ala Trp Ser Val Asp Asp Ile
450 455 460 

Gly Glu Leu Gln Val Glu Leu Pro Glu Val His Thr Asn Ser Cys Asp
465 470 475 480 

Asn Ile Ser Gln Phe Ser Val Asp Ser Ile Thr Ser Gln Glu Ser Lys
485 490 495 

Glu Pro Val Phe Ile Ala Ala Gly Asp Ile Arg Arg Arg Leu Ser Glu
500 505 510 

Gln Leu Ala His Thr Pro Thr Ala Phe Lys Arg Asp Pro Glu Asp Pro
515 520 525 

Ser Ala Val Ala Leu Lys Glu Pro Trp Gln Glu Lys Val Arg Arg Ile
530 535 540 

Arg Glu Gly Ser Pro Tyr Gly His Leu Pro Asn Trp Arg Leu Leu Ser 
545 550 555 560 

Val Ile Val Lys Cys Gly Asp Asp Leu Arg Gln Glu Leu Leu Ala Phe
565 570 575 

Gln Val Leu Lys Gln Leu Gln Ser Ile Trp Glu Gln Glu Arg Val Pro
580 585 590 

Leu Trp Ile Lys Pro Ile Gln Asp Ser Cys Glu Ile Thr Thr Asp Ser
595 600 605 

Gly Met Ile Glu Pro Val Val Asn Ala Val Ser Ile His Gln Val Lys
610 615 620 

Lys Gln Ser Gln Leu Ser Leu Leu Asp Tyr Phe Leu Gln Glu His Gly
625 630 635 640 
Ser Tyr Thr Thr Glu Ala Phe Leu Ser Ala Gln Arg Asn Phe Val Gln
645 650 655 

Ser Cys Ala Gly Tyr Cys Leu Val Cys Tyr Leu Leu Gln Val Lys Asp
660 665 670 

Arg His Asn Gly Asn Ile Leu Leu Asp Ala Glu Gly His Ile Ile His
675 680 685 

Ile Asp Phe Gly Phe Ile Leu Ser Ser Ser Pro Arg Asn Leu Gly Phe
690 695 700

Glu Thr Ser Ala Phe Lys Leu Thr Thr Glu Phe Val Asp Val Met Gly
705 710 715 720 

Gly Leu Asp Gly Asp Met Phe Asn Tyr Tyr Lys Met Leu Met Leu Gln
725 730 735 

Gly Leu Ile Ala Ala Arg Lys His Met Asp Lys Val Val Gln Ile Val
740 745 750 

Glu Ile Met Gln Gln Gly Ser Gln Leu Pro Cys Phe His Gly Ser Ser
755 760 765 

Thr Ile Arg Asn Leu Lys Glu Arg Phe His Met Ser Met Thr Glu Glu
770 775 780 

Gln Leu Gln Leu Leu Val Glu Gln Met Val Asp Gly Ser Met Arg Ser
785 790 795 800 

Ile Thr Thr Lys Leu Tyr Asp Gly Phe Gln Tyr Leu Thr Asn Gly Ile
805 810 815 

Met 



(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2451 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA( genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 



ATGGGAGATA CAGTAGTGGA GCCTGCCCCC TTGAAGCCAA CTTCTGAGCC CACTTCTGGC 60
CCACCAGGGA ATAATGGGGG GTCCCTGCTA AGTGTCATCA CGGAGGGGGT CGGGGAACTA 120

TCAGTGATTG AOOCTGAGGT GGOCX»GAAG GOCTGOCAGG AGGTGTTGGA GAAAGTCAAG 180 

CTTTTGCATG GAGGOGTGGC AGTCTCTAGC AGAGGCACDOC CACTGGAGTT GGTCAATGGG 240 

GATGGTGTGG ACAGTGAGAT OOGTTGOCTA GATGATOCAC CTGCXXZAGAT CAGGGAGGAG 300 

GAAGATGAGA TGGGGGCXX3C TGTGGOCTCA GGCACAGOCA AAGGAGCAAG AAGAOGGOGG 360 

C»GAACAACT CAQCTAAACA GTCTTGGCTG CTGAGGCTGT TTGAGTCAAA ACTCTTTCAC 420 

ATCTOCATGG CX3VTTTCATA OCTGTATAAC TOCAAGQAGC CTGGAGTACA AGOCTACATT 480 

GGCAAOCX3GC TCTTCTGCTT TOGCAAOGAG GAOGTGGACT TCTATCTOCX: CXZAGTTGCTT 540 

AACATGTACA TCCACATGGA TGAGGAOGIG GGTQATGCXA TTAAGGCXn'A CATAGTOCAC 600 

OGTTGQCXXX: AGAGCATTAA CTTTTOOCTC CAGTGTGCXX: TGTTOCTTGG GGOCTATTCT 660 

OXaVGACATGC ACATTTOCAC TCAACGACAC TC0CX3TGGGA OCAAGCTAOG GAAGCTGATC 720 

CTCTCAGATG AGCTAAAGOC AGCTCACAGG AAGAGGGAGC TGCCCTXDCTT GAG0CXX3G0C 780 

OCTGATAiCAG GGCTGTCTOC CTOCAAAAGG ACTCAOCAGC GCTCTAAGTC AGATGOCACT 840 

GOCAGCATAA GTCTCAGCAG CAAGCTGAAA CGAACAGOCA GCAAOOCTAA AGTGGAGAAT 900 

GAGGATGAGG AGCHCTCXnC CAGCAOOGAG AGTATTGATA ATTCATTCAG TTCOOCTCTT 960 

OGACTGGCTC CTGAGAGAGA ATTCATCAAG TCOCTGATGG CX3ATOGGCAA GOGGCTGGOC 1020 

AOGCIXDOOCA OCAAAGAGCA GAAAACACAG AGGCTGATCT CAGAGCTCTC CX2IGCTCAAC 1080 

CATAAGCTOC CTGOOOGAGT CTGGCTGOOC ACTGCTGGCT TTGAOCAOCA OGTOGTOCX?! 1140 

GTAOSXACA CACAGGCTGT TGTOCTCAAC TOGAAGGACA AGGCTOCCTA OCTGATTTAT 1200 

GTGGAAGTOC TTGAATGTGA AAACTTTGAC AOCACCAGTG TCOJrOCCCG GATCXXXX5AG 1260 

AADCGAATTC GGAGTAOGAG GTOCCTAGAA AACTTGCXXXJ AATGTGGTAT TAOCXVVTGAG 1320 

CAGCGAGCTG GCAGCTTCAG CACTGTGCOC AACTATGACA AOGATGATGA GGOCTGGTOG 1380 

GTGGATGACA TAGGOGAGCT GCAAGTGGAG CTCOOOGAAG TGCATAOCAA CAGCTGTGAC 1440 

AACATCTCOC AGTTCTCTGT GGACAGCATC ACCAGOCAGG AGAGCAAGGA GOCTGTGTTC 1500 

ATTGCAGCAG GGGACATOOG CXXGOGOCTT TOGGAACAGC TGGCTCATAC OCXDGACAGCX: 1560 
TTCAAACGAG ACCCAGAAGA TCCTTCTGCA GTTGCTCTCA AAGAGCCCTG GCAGGAGAAA 1620

GTAOGOOOGA TCAGAGAGGG CTCXXXCTAC GOXATCJrOC CX»ATTGGOG GCTOCTGTCA 1680 

CTCATTCTCA AGTCTOGGGA TGAOCTTOGG CAAGAGCTTC TGGOCTTTCA GGTGTTGAAG 1740 

CAACTGCAG?r OCATTTGGGA ACAGGAGOGA GTGCXXXnTT GGATCAAGOC AATACAAGAT 1800 

TCTTGTGAAA TTAOSACTGA TAGTGGCATG ATTGAAOCAG TGGTCAATQC TGTCTOCATC 1860 

CATCAGGTGA AGAAACAGTC ACAGCTCTOC TTGCTOGATT ACITOCTACA OGAGCAOQGC 1920 

AGTTACAOCA CTGAGGCATT OCTCAGTOCA CAGOGCAATT TTGTGCAAAG TTCTGCTGGG 1980 

TACTGCTTGG TCTGCTAOCT GCTGCAAGTC AAGGACAGAC ACAATGGGAA TATOCTTTTG 2040 

GACGCAGAAG GCCACATCAT OCACATCGAC TTTGGCTTCA TOCTCTOCAG CTCAOOOOGA 2100 

AATCTGOOCT TTGAGAOGTC AOOCTTTAAG CIGAOCACAG AGTTTGTGGA TGTGATGGGC 2160 

GGOCTGGATG GOGACATGTT CAACTACTAT AAGATGCTGA TGCTGCAAGG GCTGATTGGC 2220 

GCTCGGAAAC ACATGGACAA GGTGGTGCAG ATOGTGGAGA TCATGCAGCA AGGTTCTCAG 2280 

CITOCTTGCT TOCATGGCTC CAGCAOCATT OGAAAOCTCA AAGAGAGGTT OCACATGAGC 2340 

ATGACTGAGG AGCAGCTGCA GCTGCTGGrG GAGCAGATGG TGGATGGCAG TATGCX3GTCT 2400 

ATCAOCAOCA AACTCTATGA CGQCTTOCAG TACCTCAOCA AC3GGCATCAT G 2451 

(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3602 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA( genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: Human fetal brain cDNA library 

(B) CLONE: GEN-428B12c2 

(ix) FEATURE: 
(A) NAME/KEY: CDS

(B) LOCATION: 429..2879

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 



GGTGGCTCAC GOCTGTAATC OCAGCACTTT GGGAGGACAA GGCAGATCOC TTCAGGOCAG 60 

GAGGTAGAGG CTGCAGTGAG CTGTGATGGT GCCACTGCAC TOCAGOCTGG GCAATGAAGC 120 

AAGftOOCTAT CTGAAAAAAA AAATTTTTAA AAAAGGCAAA GATOGOGCTG GGGCACCAAA 180 

TATTGCA6AG GAAAG06AAC GTGTGTACTC CTTGAGGTGG GGAACATGAC OCACTTCAGG 240 

TGCAGAAAGA AGACTTGTAT GGGGCTGGTG CAGOCTOCGC GQOOGCTGTC AGGGAflGGGC 300 

AGGCX3GGCAA TGGAAOXXSG GAGCGGTCGC TGCTGCTGAG GOGGCAGTGT OGGCAGTOCA 360 

ACCOOGACTG OOOQCAOOOC CICCGOGGGG TCOOOCAGAG CTTGGAAGCT CGAAGTCTGG 420 

CTGTGGGC ATG GGA GAT ACA GTA GTG GAG CCT GCC CCC TTG AAG CCA ACT 470
Met Gly Asp Thr Val Val Glu Pro Ala Pro Leu Lys Pro Thr
1 5 10

TCT GAG CCC ACT TCT GGC CCA CCA GGG AAT AAT GGG GGG TCC CTG CTA 518
Ser Glu Pro Thr Ser Gly Pro Pro Gly Asn Asn Gly Gly Ser Leu Leu 
15 20 25 30 

AGT GTC ATC ACG GAG GGG GTC GGG GAA CTA TCA GTG ATT GAC CCT GAG 566
Ser Val Ile Thr Glu Gly Val Gly Glu Leu Ser Val Ile Asp Pro Glu
35 40 45

GTG GCC CAG AAG GCC TGC CAG GAG GTG TTG GAG AAA GTC AAG CTT TTG 614
Val Ala Gln Lys Ala Cys Gln Glu Val Leu Glu Lys Val Lys Leu Leu
50 55 60

CAT GGA GGC GTG GCA GTC TCT AGC AGA GGC ACC CCA CTG GAG TTG GTC 662
His Gly Gly Val Ala Val Ser Ser Arg Gly Thr Pro Leu Glu Leu Val 
65 70 75 

AAT GGG GAT GGT GTG GAC AGT GAG ATC CGT TGC CTA GAT GAT CCA CCT 710 
Asn Gly Asp Gly Val Asp Ser Glu Ile Arg Cys Leu Asp Asp Pro Pro
80 85 90 

GCC CAG ATC AGG GAG GAG GAA GAT GAG ATG GGG GCC GCT GTG GCC TCA 758
Ala Gln Ile Arg Glu Glu Glu Asp Glu Met Gly Ala Ala Val Ala Ser
95 100 105 110 

GGC ACA GCC AAA GGA GCA AGA AGA CGG CGG CAG AAC AAC TCA GCT AAA 806
Gly Thr Ala Lys Gly Ala Arg Arg Arg Arg Gln Asn Asn Ser Ala Lys
115 120 125

CAG TCT TGG CTG CTG AGG CTG TTT GAG TCA AAA CTG TTT GAC ATC TCC 854
Gln Ser Trp Leu Leu Arg Leu Phe Glu Ser Lys Leu Phe Asp Ile Ser
130 135 140 

ATG GCC ATT TCA TAC CTG TAT AAC TCC AAG GAG CCT GGA GTA CAA GCC 902
Met Ala Ile Ser Tyr Leu Tyr Asn Ser Lys Glu Pro Gly Val Gln Ala
145 150 155 

TAC ATT GGC AAC CGG CTC TTC TGC TTT CGC AAC GAG GAC GTG GAC TTC 950
Tyr Ile Gly Asn Arg Leu Phe Cys Phe Arg Asn Glu Asp Val Asp Phe
160 165 170 

TAT CTG CCC CAG TTG CTT AAC ATG TAC ATC CAC ATG GAT GAG GAC GTG 998
Tyr Leu Pro Gln Leu Leu Asn Met Tyr Ile His Met Asp Glu Asp Val
175 180 185 190 

GGT GAT GCC ATT AAG CCC TAC ATA GTC CAC CGT TGC CGC CAG AGC ATT 1046
Gly Asp Ala Ile Lys Pro Tyr Ile Val His Arg Cys Arg Gln Ser Ile
195 200 205 

AAC TTT TCC CTC CAG TGT GCC CTG TTG CTT GGG GCC TAT TCT TCA GAC 1094
Asn Phe Ser Leu Gln Cys Ala Leu Leu Leu Gly Ala Tyr Ser Ser Asp
210 215 220 

ATG CAC ATT TCC ACT CAA CGA CAC TCC CGT GGG ACC AAG CTA CGG AAG 1142
Met His Ile Ser Thr Gln Arg His Ser Arg Gly Thr Lys Leu Arg Lys
225 230 235 

CTG ATC CTC TCA GAT GAG CTA AAG CCA GCT CAC AGG AAG AGG GAG CTG 1190
Leu Ile Leu Ser Asp Glu Leu Lys Pro Ala His Arg Lys Arg Glu Leu
240 245 250 

CCC TCC TTG AGC CCG GCC CCT GAT ACA GGG CTG TCT CCC TCC AAA AGG 1238
Pro Ser Leu Ser Pro Ala Pro Asp Thr Gly Leu Ser Pro Ser Lys Arg 
255 260 265 270 

ACT CAC CAG CGC TCT AAG TCA GAT GCC ACT GCC AGC ATA AGT CTC AGC 1286
Thr His Gln Arg Ser Lys Ser Asp Ala Thr Ala Ser Ile Ser Leu Ser
275 280 285 

AGC AAC CTG AAA CGA ACA GCC AGC AAC CCT AAA GTG GAG AAT GAG GAT 1334
Ser Asn Leu Lys Arg Thr Ala Ser Asn Pro Lys Val Glu Asn Glu Asp 
290 295 300 

GAG GAG CTC TCC TCC AGC ACC GAG AGT ATT GAT AAT TCA TTC AGT TCC 1382
Glu Glu Leu Ser Ser Ser Thr Glu Ser Ile Asp Asn Ser Phe Ser Ser
305 310 315 
CCT GTT CGA CTG GCT CCT GAG AGA GAA TTC ATC AAG TCC CTG ATG GCG 1430
Pro Val Arg Leu Ala Pro Glu Arg Glu Phe Ile Lys Ser Leu Met Ala
320 325 330 

ATC GGC AAG CGG CTG GCC ACG CTC CCC ACC AAA GAG CAG AAA ACA CAG 1478
Ile Gly Lys Arg Leu Ala Thr Leu Pro Thr Lys Glu Gln Lys Thr Gln
335 340 345 350 

AGG CTG ATC TCA GAG CTC TOO CTG CTC AAC CAT AAG CTC OCT GOC OGA 1526 
Arg Leu Ile Ser Glu Leu Ser Leu Leu Asn His Lys Leu Pro Ala Arg
355 360 365 

GTC TGG CTG CCC ACT GCT GGC TTT GAC CAC CAC GTG GTC CGT GTA CCC 1574
Val Trp Leu Pro Thr Ala Gly Phe Asp His His Val Val Arg Val Pro
370 375 380 

CAC ACA CAG GCT GTT GTC CTC AAC TCC AAG GAC AAG GCT CCC TAC CTG 1622
His Thr Gln Ala Val Val Leu Asn Ser Lys Asp Lys Ala Pro Tyr Leu
385 390 395 

ATT TAT GTG GAA GTC CTT GAA TGT GAA AAC TTT GAC ACC ACC AGT GTC 1670
Ile Tyr Val Glu Val Leu Glu Cys Glu Asn Phe Asp Thr Thr Ser Val
400 405 410 

CCT GCC CGG ATC CCC GAG AAC CGA ATT CGG AGT ACG AGG TCC GTA GAA 1718
Pro Ala Arg Ile Pro Glu Asn Arg Ile Arg Ser Thr Arg Ser Val Glu
415 420 425 430 

AAC TTG CCC GAA TGT GGT ATT ACC CAT GAG CAG CGA GCT GGC AGC TTC 1766
Asn Leu Pro Glu Cys Gly Ile Thr His Glu Gln Arg Ala Gly Ser Phe
435 440 445

AGC ACT GTG CCC AAC TAT GAC AAC GAT GAT GAG GCC TGG TCG GTG GAT 1814
Ser Thr Val Pro Asn Tyr Asp Asn Asp Asp Glu Ala Trp Ser Val Asp
450 455 460

GAC ATA GGC GAG CTG CAA GTG GAG CTC CCC GAA GTG CAT ACC AAC AGC 1862
Asp Ile Gly Glu Leu Gln Val Glu Leu Pro Glu Val His Thr Asn Ser
465 470 475 

TGT GAC AAC ATC TCC CAG TTC TCT GTG GAC AGC ATC ACC AGC CAG GAG 1910
Cys Asp Asn Ile Ser Gln Phe Ser Val Asp Ser Ile Thr Ser Gln Glu
480 485 490 

AGC AAG GAG CCT GTG TTC ATT GCA GCA GGG GAC ATC CGC CGG CGC CTT 1958
Ser Lys Glu Pro Val Phe Ile Ala Ala Gly Asp Ile Arg Arg Arg Leu
495 500 505 510 

TCG GAA CAG CTG GCT CAT ACC CCG ACA GCC TTC AAA CGA GAC CCA GAA 2006
Ser Glu Gln Leu Ala His Thr Pro Thr Ala Phe Lys Arg Asp Pro Glu
515 520 525

GAT CCT TCT GCA GTT GCT CTC AAA GAG CCC TGG CAG GAG AAA GTA CGG 2054
Asp Pro Ser Ala Val Ala Leu Lys Glu Pro Trp Gln Glu Lys Val Arg
530 535 540 

CGG ATC AGA GAG GGC TCC CCC TAC GGC CAT CTC CCC AAT TGG CGG CTC 2102
Arg Ile Arg Glu Gly Ser Pro Tyr Gly His Leu Pro Asn Trp Arg Leu
545 550 555 

CTG TCA GTC ATT GTC AAG TGT GGG GAT GAC CTT CGG CAA GAG CTT CTG 2150
Leu Ser Val Ile Val Lys Cys Gly Asp Asp Leu Arg Gln Glu Leu Leu
560 565 570 

GCC TTT CAG GTG TTG AAG CAA CTG CAG TCC ATT TGG GAA CAG GAG CGA 2198
Ala Phe Gln Val Leu Lys Gln Leu Gln Ser Ile Trp Glu Gln Glu Arg
575 580 585 590 

GTG CCC CTT TGG ATC AAG CCA ATA CAA GAT TCT TGT GAA ATT ACG ACT 2246
Val Pro Leu Trp Ile Lys Pro Ile Gln Asp Ser Cys Glu Ile Thr Thr
595 600 605 

GAT AGT GGC ATG ATT GAA CCA GTG GTC AAT GCT GTG TCC ATC CAT CAG 2294
Asp Ser Gly Met Ile Glu Pro Val Val Asn Ala Val Ser Ile His Gln
610 615 620 

GTG AAG AAA CAG TCA CAG CTC TCC TTG CTC GAT TAC TTC CTA CAG GAG 2342
Val Lys Lys Gln Ser Gln Leu Ser Leu Leu Asp Tyr Phe Leu Gln Glu
625 630 635 

CAC GGC AGT TAC ACC ACT GAG GCA TTC CTC AGT GCA CAG CGC AAT TTT 2390
His Gly Ser Tyr Thr Thr Glu Ala Phe Leu Ser Ala Gln Arg Asn Phe
640 645 650 

GTG CAA AGT TGT GCT GGG TAC TGC TTG GTC TGC TAC CTG CTG CAA GTC 2438
Val Gln Ser Cys Ala Gly Tyr Cys Leu Val Cys Tyr Leu Leu Gln Val
655 660 665 670 

AAG GAC AGA CAC AAT GGG AAT ATC CTT TTG GAC GCA GAA GGC CAC ATC 2486
Lys Asp Arg His Asn Gly Asn Ile Leu Leu Asp Ala Glu Gly His Ile
675 680 685 

ATC CAC ATC GAC TTT GGC TTC ATC CTC TCC AGC TCA CCC CGA AAT CTG 2534
Ile His Ile Asp Phe Gly Phe Ile Leu Ser Ser Ser Pro Arg Asn Leu
690 695 700 

GGC TTT GAG ACG TCA GCC TTT AAG CTG ACC ACA GAG TTT GTG GAT GTG 2582
Gly Phe Glu Thr Ser Ala Phe Lys Leu Thr Thr Glu Phe Val Asp Val
705 710 715 
ATG GGC GGC CTG GAT GGC GAC ATG TTC AAC TAC TAT AAG ATG CTG ATG 2630
Met Gly Gly Leu Asp Gly Asp Met Phe Asn Tyr Tyr Lys Met Leu Met
720 725 730 

CTG CAA GGG CTG ATT GCC GCT CGG AAA CAC ATG GAC AAG GTG GTG CAG 2678
Leu Gln Gly Leu Ile Ala Ala Arg Lys His Met Asp Lys Val Val Gln
735 740 745 750 

ATC GTG GAG ATC ATG CAG CAA GGT TCT CAG CTT CCT TGC TTC CAT GGC 2726
Ile Val Glu Ile Met Gln Gln Gly Ser Gln Leu Pro Cys Phe His Gly
755 760 765 

TCC AGC ACC ATT CGA AAC CTC AAA GAG AGG TTC CAC ATG AGC ATG ACT 2774
Ser Ser Thr Ile Arg Asn Leu Lys Glu Arg Phe His Met Ser Met Thr
770 775 780 

GAG GAG CAG CTG CAG CTG CTG GTG GAG CAG ATG GTG GAT GGC AGT ATG 2822 
Glu Glu Gln Leu Gln Leu Leu Val Glu Gln Met Val Asp Gly Ser Met
785 790 795 

CGG TCT ATC ACC ACC AAA CTC TAT GAC GGC TTC CAG TAC CTC ACC AAC 2870
Arg Ser Ile Thr Thr Lys Leu Tyr Asp Gly Phe Gln Tyr Leu Thr Asn
800 805 810 

GGC ATC ATG TGA CAOGCTOCTC AGOOCAGGAG TGGTGGGGGG TOCAGGQCAC 2922
Gly Ile Met *
815

OCTOOCTAGA GGGCOCTTGT CTGAGAAAOC OCAAAOCAGG AAAOOOCAOC TAOOCAAOCA 2982 

TOCACOCAAG GGAAATOGAA OOCAAGAAAC A0GAA06ATC ATGTOGTAACJFOOGAGAGCT 3042 

TGCTGAGGQG TGGGAGAGCC AGCTGTGGGG TOCAGACTTG TTGGGGCTTC OCTGOOOCTC 3102 

CTGGTCTGTG TCAGTATTAC CAOCAGACTG ACTOCAOGAC TCACTGOOCT CCAGAAAACA 3162 

GAGGTGACAA ATGTGAOOGA CACTGOOGGC TTTCTTCTOC TTGTAOOGGT CTCTCAGAGG 3222 

TTCTTTOCAC AGGOCATCCT CTTATTCOGT TCTGGGGCOC AGGAAGTGGG GAAGAGTAGG 3282 

TTCTOGGTAC TTAGGACTTG ATOCTGTGGT TGCCACTGGC CATGCTGCTG CCCAGCTCTA 3342 

OOCCTOOCAG GGAOCTAOOC CTOOCAGGGA OOGACOOCTG GOOCAAGCTC GOCTTGCTGG 3402 

OGGGOGCTGC GTGGGOOCTG CACTTGCTGA GGTTOOOCAT CATGGGCAAG GCAAGGGAAT 3462 

TOOCACAGOC CTOCAGTGTA CTGAGGGTAC TGGOCTAGCC ATGTGGAATT COCTAOOCTG 3522 

ACTOCTTCOC CAAAOOCAGG GAAAAGAGCT CTCAATTTTT TATTTTTAAT TTTTGTTTGA 3582 
AATAAAGTOC TTAGTTAGCC 3602
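
The CDS locations recited in the feature tables can be checked against the stated lengths of the corresponding protein and coding sequences. The following is a minimal sketch of that arithmetic only; the function name `cds_length` is hypothetical, and the sketch assumes, as the listing itself shows, that the stated range is 1-based, inclusive, and excludes the stop codon.

```python
def cds_length(location):
    """Return (bases, codons) for a 'start..end' CDS location.

    Assumes 1-based inclusive coordinates with the stop codon excluded.
    """
    start_text, end_text = location.split("..")
    start, end = int(start_text), int(end_text)
    bases = end - start + 1
    if bases % 3 != 0:
        raise ValueError("CDS length is not a whole number of codons")
    return bases, bases // 3


# SEQ ID NO: 30, CDS 429..2879
print(cds_length("429..2879"))
# SEQ ID NO: 33, CDS 115..2601
print(cds_length("115..2601"))
```

For SEQ ID NO: 30 this gives 2451 bases and 817 codons, in agreement with the 2451 base pairs of SEQ ID NO: 29 and the 817 amino acids of SEQ ID NO: 28. For SEQ ID NO: 33 it gives 2487 bases and 829 codons, in agreement with SEQ ID NO: 32 and SEQ ID NO: 31.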

(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 829 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 

Met Arg Phe Leu Glu Ala Arg Ser Leu Ala Val Ala Met Gly Asp Thr 
1 5 10 15

Val Val Glu Pro Ala Pro Leu Lys Pro Thr Ser Glu Pro Thr Ser Gly
20 25 30 

Pro Pro Gly Asn Asn Gly Gly Ser Leu Leu Ser Val Ile Thr Glu Gly
35 40 45 

Val Gly Glu Leu Ser Val Ile Asp Pro Glu Val Ala Gln Lys Ala Cys
50 55 60 

Gln Glu Val Leu Glu Lys Val Lys Leu Leu His Gly Gly Val Ala Val
65 70 75 80 

Ser Ser Arg Gly Thr Pro Leu Glu Leu Val Asn Gly Asp Gly Val Asp
85 90 95

Ser Glu Ile Arg Cys Leu Asp Asp Pro Pro Ala Gln Ile Arg Glu Glu
100 105 110 

Glu Asp Glu Met Gly Ala Ala Val Ala Ser Gly Thr Ala Lys Gly Ala 
115 120 125 

Arg Arg Arg Arg Gln Asn Asn Ser Ala Lys Gln Ser Trp Leu Leu Arg
130 135 140 

Leu Phe Glu Ser Lys Leu Phe Asp Ile Ser Met Ala Ile Ser Tyr Leu
145 150 155 160 

Tyr Asn Ser Lys Glu Pro Gly Val Gln Ala Tyr Ile Gly Asn Arg Leu
165 170 175 

Phe Cys Phe Arg Asn Glu Asp Val Asp Phe Tyr Leu Pro Gln Leu Leu
180 185 190 
Asn Met Tyr Ile His Met Asp Glu Asp Val Gly Asp Ala Ile Lys Pro
195 200 205 

Tyr Ile Val His Arg Cys Arg Gln Ser Ile Asn Phe Ser Leu Gln Cys
210 215 220 

Ala Leu Leu Leu Gly Ala Tyr Ser Ser Asp Met His Ile Ser Thr Gln
225 230 235 240 

Arg His Ser Arg Gly Thr Lys Leu Arg Lys Leu Ile Leu Ser Asp Glu
245 250 255 

Leu Lys Pro Ala His Arg Lys Arg Glu Leu Pro Ser Leu Ser Pro Ala 
260 265 270 

Pro Asp Thr Gly Leu Ser Pro Ser Lys Arg Thr His Gln Arg Ser Lys
275 280 285 

Ser Asp Ala Thr Ala Ser Ile Ser Leu Ser Ser Asn Leu Lys Arg Thr
290 295 300 

Ala Ser Asn Pro Lys Val Glu Asn Glu Asp Glu Glu Leu Ser Ser Ser 
305 310 315 320 

Thr Glu Ser Ile Asp Asn Ser Phe Ser Ser Pro Val Arg Leu Ala Pro
325 330 335 

Glu Arg Glu Phe Ile Lys Ser Leu Met Ala Ile Gly Lys Arg Leu Ala
340 345 350 

Thr Leu Pro Thr Lys Glu Gln Lys Thr Gln Arg Leu Ile Ser Glu Leu
355 360 365 

Ser Leu Leu Asn His Lys Leu Pro Ala Arg Val Trp Leu Pro Thr Ala 
370 375 380 

Gly Phe Asp His His Val Val Arg Val Pro His Thr Gln Ala Val Val
385 390 395 400 

Leu Asn Ser Lys Asp Lys Ala Pro Tyr Leu Ile Tyr Val Glu Val Leu
405 410 415 

Glu Cys Glu Asn Phe Asp Thr Thr Ser Val Pro Ala Arg Ile Pro Glu
420 425 430 

Asn Arg Ile Arg Ser Thr Arg Ser Val Glu Asn Leu Pro Glu Cys Gly
435 440 445 

Ile Thr His Glu Gln Arg Ala Gly Ser Phe Ser Thr Val Pro Asn Tyr
450 455 460 
Asp Asn Asp Asp Glu Ala Trp Ser Val Asp Asp Ile Gly Glu Leu Gln
465 470 475 480 

Val Glu Leu Pro Glu Val His Thr Asn Ser Cys Asp Asn Ile Ser Gln
485 490 495 

Phe Ser Val Asp Ser Ile Thr Ser Gln Glu Ser Lys Glu Pro Val Phe
500 505 510 

Ile Ala Ala Gly Asp Ile Arg Arg Arg Leu Ser Glu Gln Leu Ala His
515 520 525 

Thr Pro Thr Ala Phe Lys Arg Asp Pro Glu Asp Pro Ser Ala Val Ala
530 535 540 

Leu Lys Glu Pro Trp Gln Glu Lys Val Arg Arg Ile Arg Glu Gly Ser
545 550 555 560 

Pro Tyr Gly His Leu Pro Asn Trp Arg Leu Leu Ser Val Ile Val Lys
565 570 575 

Cys Gly Asp Asp Leu Arg Gln Glu Leu Leu Ala Phe Gln Val Leu Lys
580 585 590 

Gln Leu Gln Ser Ile Trp Glu Gln Glu Arg Val Pro Leu Trp Ile Lys
595 600 605 

Pro Ile Gln Asp Ser Cys Glu Ile Thr Thr Asp Ser Gly Met Ile Glu
610 615 620 

Pro Val Val Asn Ala Val Ser Ile His Gln Val Lys Lys Gln Ser Gln
625 630 635 640 

Leu Ser Leu Leu Asp Tyr Phe Leu Gln Glu His Gly Ser Tyr Thr Thr
645 650 655 

Glu Ala Phe Leu Ser Ala Gln Arg Asn Phe Val Gln Ser Cys Ala Gly
660 665 670 

Tyr Cys Leu Val Cys Tyr Leu Leu Gln Val Lys Asp Arg His Asn Gly
675 680 685 

Asn Ile Leu Leu Asp Ala Glu Gly His Ile Ile His Ile Asp Phe Gly
690 695 700 

Phe Ile Leu Ser Ser Ser Pro Arg Asn Leu Gly Phe Glu Thr Ser Ala
705 710 715 720 

Phe Lys Leu Thr Thr Glu Phe Val Asp Val Met Gly Gly Leu Asp Gly 
725 730 735 
Asp Met Phe Asn Tyr Tyr Lys Met Leu Met Leu Gln Gly Leu Ile Ala
740 745 750 

Ala Arg Lys His Met Asp Lys Val Val Gln Ile Val Glu Ile Met Gln
755 760 765 

Gln Gly Ser Gln Leu Pro Cys Phe His Gly Ser Ser Thr Ile Arg Asn
770 775 780 

Leu Lys Glu Arg Phe His Met Ser Met Thr Glu Glu Gln Leu Gln Leu
785 790 795 800

Leu Val Glu Gln Met Val Asp Gly Ser Met Arg Ser Ile Thr Thr Lys
805 810 815 

Leu Tyr Asp Gly Phe Gln Tyr Leu Thr Asn Gly Ile Met
820 825 

(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2487 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA( genomic ) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 

ATGAGATTCT TGGAAGCTCG AAGTCTGGCT GTGGCCATGG GAGATACAGT AGTGGAGCCT 60

GOOCXXTTGA AGOCAACTTC TGAGCCXSVCT TCTGGOOCAC CAGOGAATAA TGGGGGGTOC 120 

CTGCTAAGTG TCATCACOGA GGGGGTOGGG GAACTATCAG TGATTGAOOC TGAGGTGGOC 180 

CAGAAGGOCT GOCAGGAGGT GTTGGAGAAA GTCAAGCTTT TGCATGGAGG CGTGGCAGTC 240 

TCTAGCAGAG GCACOXACT GGAGTTGGTC AATGGGGATG GTGTGGACAG TGftGATOOGT 300 

TGOCTAGATG ATOCAOCTGC CCAGATCAGG GAGGAGGAAG ATGAGATGGG GGOOGCTGTG 360 

GCXnXSVGGCA CAC30CAAAC3G AGCAAGAAGA GGGOGGCAGA ACAACTCAQC TAAACAGTCT 420 

TGGCTGCTGA GGCTGTTTGA GTCAAAACTG TTTGACATCT OCATGGCCAT TTCATAOCTG 480 

TATAACTOCA AGGAGCCTGG AGTACAAGCC TACATTGGCA A00C3GCTCTT CTGCnTOGC 540 
AACGAGGACG TGGACTTCTA TCTGCCCCAG TTGCTTAACA TGTACATCCA CATGGATGAG 600

GACXSTGGGTC ATGCX2A.TTAA GOOCTACATA GTOCAOOGTT GOOGOCAGAG CATTAACTTT 660 

TOOCTGCAGT GTGOOCTGTT GC?rTGGGGOC TATTCTTCAG ACATGCACAT TTOCACTCAA 720 

aSACACTOCX: GTGGGAOCAA GCTACGGAAG CTGATOCTCT CAGATGAGCT AAAGOCAGCT 780 

CACAGGAAGA GGGAGCTGOC CTOCTTGAGC OOGGOOOCTG ATACAGGGCT GTCICCCTCC 840 

AAAAGGACTC AOCAGOGCTC TAAGTCAGAT GOCACTGOCA GCATAAGTCT CAGCAGCAAC 900 

CTGAAAOGAA CAGOCAGCAA COCTAAAGTG GAGAATGAQG ATGAGGAGCT CTCCTOCAGC 960 

ACGGAGAGTA TTGATAATTC ATTCAGTTOC OCTGTroGAC TQGCTCCTGA GAGAGAATTC 1020 

ATCAAGTCCC TGATGGOGAT OGGCAAGCGG CTGGOCACGC 'raXCAOCAA AGAGCAGAAA 1080 

ACACAGAGGC TGATCTCAGA GCTCTOOCTG CTCAAOCATA AGCTOOCTGC GOGAGTCTGG 1140 

CPGOOCACTG CTGGCTTTGA CX»OCAOGTG GTGOGTGTAC OOCACACACA GGCTGTTGTC 1200 

CTCAACTOCA AGGACAAGGC TCOCTACCTG ATTTATGTGG AAGTOCTTGA ATGTGAAAAC 1260 

TTTGACAOCA CXZAGTGTCXX: TGCXX3GGATC CCCGAGAACX: GAATTOGGAG TAOGAGGTOC 1320 

GTAGAAAACT TGOOOGAATG TGGTATTAOC CATGAGCAGC GAGCTGGCAG CTTCAGCACT 1380 

GTQOOCAACT AOXSACAACGA TGATGAGGOC TGGTOOGTOG ATGACATAGG OGAOCTOCAA 1440 

GTGGAGCTOC CXX3AAGTGCA TAlOCAACAGC TGTGACAACA TCTOCCAGTT CTCTGTGGAC 1500 

AGCATCAOCA GOCAGGAGAG CAAGGAGC3CT GTGTTCATTG CAGCAGGGGA CATGOGOOGG 1560 

OGCX^TTTOGG AACAGCTGGC TCATAOOGOG ACAGOCTTCA AAOSAGAOCC AGAAGATOCT 1620 

TCTGCAGTTG CTCTCAAAGA OOOCTOGCAG GAGAAAGTAC GGOGGATCAG AGAGGGCTOC 1680 

CXCTAlOGGOC ATCTOOOCAA TTGGOGGCTC CTGTCAGTCA TTGTCAAGTG TGGGGATGAC 1740 

CTTOGGCAAG AGCTTCTGGC CTTTCAGGTG TTGAAGCAAC TGCAGTOCAT TTGGGAACAG 1800 

GAGCGAGTGC COCTTTGGAT CAAGGCAATA CAAGATTCTT GTGAAATT7VC GACTGATAGT 1860 

GGCATGATTG AAOCAGTGGT CAATGCTGTG TOCATOCATC AGGTGAAGAA ACAGTCACAG 1920 

CTCTOCTTGC TCGATTACTT OCTACAGGAG CAOGGCAGTT ACAOCACTGA GGCATTOCTC 1980 

AGTGCACAGC GCAATTTTGT GCAAAGTTGT GCTGGGTACT GCTTGGTCTG CTAOCTGCTG 2040 
CAAGTCAAGG ACAGACACAA TGGGAATATC CTTTTGGACG CAGAAGGCCA CATCATCCAC 2100

ATOGACTTTG GCTTTCATOCrr CTOCAGCTCA CXXXX3AAATC TGGGCTTTGA GACX3TCAG0C 2160 

TTTAAGCTGA CXSVCAGAGTT TGTGGATGTG ATGGGOGGCX: TGGATOOOGA CaTGTTCAAC 2220 

TACTATAAGA TGCTGATGCT GCAAGGGCTG ATTGCXXXTTC GGAAACACAT GGACAAGGTO 2280 

GTGCAGATOG TGGAGATCAT GCAGCAAGGT TCTCAGCrTTC CTTGCTTOCA TGGCTOCAGC 2340 

AOCATTOGAA AOCTCAAAGA GAOGTTOCAC ATGAOCATGA CTGAGGAGCA GCTGCAGCTG 2400 

CTOSIX9GAOC AGATOGTOGA TGGCAGTATG CX3GTCTATCA CCAOCAAACT CTATGAOOOC 2460 

TTOCAGTAOC TCACCAACGG CATCATG 2487 



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3324 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: Human fetal brain cDNA library

(B) CLONE: GEN-428B12C1

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 115..2601

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:



COGGAATTCC GGGAAGGOCG GAGCAAGTTT TGAAGAAGTC OCTATCAGAT TACACTTGGT 60 

TGACTACTOC GGAGCAGOCA CTAAGAGGGA TGAACAGGGC TGOGTGGAAA TTGA ATG 117 

Met 
1 



AGA TTC TTG GAA GCT CGA AGT CTG GCT GTG GCC ATG GGA GAT ACA GTA 165
Arg Phe Leu Glu Ala Arg Ser Leu Ala Val Ala Met Gly Asp Thr Val
5 10 15

GTG GAG CCT GCC CCC TTG AAG CCA ACT TCT GAG CCC ACT TCT GGC CCA 213
Val Glu Pro Ala Pro Leu Lys Pro Thr Ser Glu Pro Thr Ser Gly Pro 
20 25 30 

CCA GGG AAT AAT GGG GGG TCC CTG CTA AGT GTC ATC ACG GAG GGG GTC 261
Pro Gly Asn Asn Gly Gly Ser Leu Leu Ser Val Ile Thr Glu Gly Val
35 40 45 

GGG GAA CTA TCA GTG ATT GAC CCT GAG GTG GCC CAG AAG GCC TGC CAG 309
Gly Glu Leu Ser Val Ile Asp Pro Glu Val Ala Gln Lys Ala Cys Gln
50 55 60 65 

GAG GTG TTG GAG AAA GTC AAG CTT TTG CAT GGA GGC GTG GCA GTC TCT 357 
Glu Val Leu Glu Lys Val Lys Leu Leu His Gly Gly Val Ala Val Ser
70 75 80

AGC AGA GGC ACC CCA CTG GAG TTG GTC AAT GGG GAT GGT GTG GAC AGT 405
Ser Arg Gly Thr Pro Leu Glu Leu Val Asn Gly Asp Gly Val Asp Ser 
85 90 95 

GAG ATC CGT TGC CTA GAT GAT CCA CCT GCC CAG ATC AGG GAG GAG GAA 453
Glu Ile Arg Cys Leu Asp Asp Pro Pro Ala Gln Ile Arg Glu Glu Glu
100 105 110 

GAT GAG ATG GGG GCC GCT GTG GCC TCA GGC ACA GCC AAA GGA GCA AGA 501
Asp Glu Met Gly Ala Ala Val Ala Ser Gly Thr Ala Lys Gly Ala Arg
115 120 125 

AGA CGG CGG CAG AAC AAC TCA GCT AAA CAG TCT TGG CTG CTG AGG CTG 549
Arg Arg Arg Gln Asn Asn Ser Ala Lys Gln Ser Trp Leu Leu Arg Leu
130 135 140 145 

TTT GAG TCA AAA CTG TTT GAC ATC TCC ATG GCC ATT TCA TAC CTG TAT 597
Phe Glu Ser Lys Leu Phe Asp Ile Ser Met Ala Ile Ser Tyr Leu Tyr
150 155 160 

AAC TCC AAG GAG CCT GGA GTA CAA GCC TAC ATT GGC AAC CGG CTC TTC 645
Asn Ser Lys Glu Pro Gly Val Gln Ala Tyr Ile Gly Asn Arg Leu Phe
165 170 175 

TGC TTT CGC AAC GAG GAC GTG GAC TTC TAT CTG CCC CAG TTG CTT AAC 693
Cys Phe Arg Asn Glu Asp Val Asp Phe Tyr Leu Pro Gln Leu Leu Asn
180 185 190 

ATG TAC ATC CAC ATG GAT GAG GAC GTG GGT GAT GCC ATT AAG CCC TAC 741
Met Tyr Ile His Met Asp Glu Asp Val Gly Asp Ala Ile Lys Pro Tyr
195 200 205 
ATA GTC CAC CGT TGC CGC CAG AGC ATT AAC TTT TCC CTC CAG TGT GCC 789
Ile Val His Arg Cys Arg Gln Ser Ile Asn Phe Ser Leu Gln Cys Ala
210 215 220 225 

CTG TTG CTT GGG GCC TAT TCT TCA GAC ATG CAC ATT TCC ACT CAA CGA 837
Leu Leu Leu Gly Ala Tyr Ser Ser Asp Met His Ile Ser Thr Gln Arg
230 235 240 

CAC TCC CGT GGG ACC AAG CTA CGG AAG CTG ATC CTC TCA GAT GAG CTA 885
His Ser Arg Gly Thr Lys Leu Arg Lys Leu Ile Leu Ser Asp Glu Leu
245 250 255 

AAG CCA GCT CAC AGG AAG AGG GAG CTG CCC TCC TTG AGC CCG GCC CCT 933
Lys Pro Ala His Arg Lys Arg Glu Leu Pro Ser Leu Ser Pro Ala Pro 
260 265 270 

GAT ACA GGG CTG TCT CCC TCC AAA AGG ACT CAC CAG CGC TCT AAG TCA 981
Asp Thr Gly Leu Ser Pro Ser Lys Arg Thr His Gln Arg Ser Lys Ser
275 280 285 

GAT GCC ACT GCC AGC ATA AGT CTC AGC AGC AAC CTG AAA CGA ACA GCC 1029
Asp Ala Thr Ala Ser Ile Ser Leu Ser Ser Asn Leu Lys Arg Thr Ala
290 295 300 305 

AGC AAC CCT AAA GTG GAG AAT GAG GAT GAG GAG CTC TCC TCC AGC ACC 1077
Ser Asn Pro Lys Val Glu Asn Glu Asp Glu Glu Leu Ser Ser Ser Thr 
310 315 320 

GAG AGT ATT GAT AAT TCA TTC AGT TCC CCT GTT CGA CTG GCT CCT GAG 1125
Glu Ser Ile Asp Asn Ser Phe Ser Ser Pro Val Arg Leu Ala Pro Glu
325 330 335

AGA GAA TTC ATC AAG TCC CTG ATG GCG ATC GGC AAG CGG CTG GCC ACG 1173
Arg Glu Phe Ile Lys Ser Leu Met Ala Ile Gly Lys Arg Leu Ala Thr
340 345 350 

CTC CCC ACC AAA GAG CAG AAA ACA CAG AGG CTG ATC TCA GAG CTC TCC 1221
Leu Pro Thr Lys Glu Gln Lys Thr Gln Arg Leu Ile Ser Glu Leu Ser
355 360 365 

CTG CTC AAC CAT AAG CTC CCT GCC CGA GTC TGG CTG CCC ACT GCT GGC 1269
Leu Leu Asn His Lys Leu Pro Ala Arg Val Trp Leu Pro Thr Ala Gly 
370 375 380 385 

TTT GAC CAC CAC GTG GTC CGT GTA CCC CAC ACA CAG GCT GTT GTC CTC 1317
Phe Asp His His Val Val Arg Val Pro His Thr Gin Ala Val Val Leu 
390 395 400 

AAC TCC AAG GAC AAG GCT CCC TAC CTG ATT TAT GTG GAA GTC CTT GAA 1365
Asn Ser Lys Asp Lys Ala Pro Tyr Leu Ile Tyr Val Glu Val Leu Glu
405 410 415

TGT GAA AAC TTT GAC ACC ACC AGT GTC CCT GCC CGG ATC CCC GAG AAC 1413
Cys Glu Asn Phe Asp Thr Thr Ser Val Pro Ala Arg Ile Pro Glu Asn
420 425 430 

CGA ATT CGG AGT ACG AGG TCC GTA GAA AAC TTG CCC GAA TGT GGT ATT 1461
Arg Ile Arg Ser Thr Arg Ser Val Glu Asn Leu Pro Glu Cys Gly Ile
435 440 445 

ADC CAT GAG CAG OGA GCT.GGC AGC TTC AGC ACT GTG OOC AAC TAT GAC 1509 
Thr His Glu Gln Arg Ala Gly Ser Phe Ser Thr Val Pro Asn Tyr Asp 
450 455 460 465 

AAC GAT GAT GAG GCC TGG TOG GTG GAT GAC ATA GGC GAG CT6 CAA GTG 1557 
Asn Asp Asp Glu Ala Trp Ser Val Asp Asp Ile Gly Glu Leu Gln Val 
470 475 480 

GAG CrO OOC GAA GTG CAT ACC AAC AGC TCT GAC AAC ATC TCC CAG TTC 1605 
Glu Leu Pro Glu Val His Thr Asn Ser Cys Asp Asn Ile Ser Gln Phe 
485 490 495 

TCT GTG GAC AGC ATC AOC AGC CAG GAG AGC AAG GAG OCT GTG TTC ATT 1653 
Ser Val Asp Ser Ile Thr Ser Gln Glu Ser Lys Glu Pro Val Phe Ile 
500 505 510 

GOA GCA GGG GAC ATC OGC OGG OGC CTT TOG GAA CAG CTG GCT CAT AOC 1701 
Ala Ala Gly Asp Ile Arg Arg Arg Leu Ser Glu Gln Leu Ala His Thr 
515 520 525 

OOG ACA GOC TTC AAA OGA GAC OCA GAA GAT. OCT TCT GCA GTT GCT OTO 1749 
Pro Thr Ala Phe Lys Arg Asp Pro Glu Asp Pro Ser Ala Val Ala Leu 
530 535 540 545 

AAA GAG OOC TGG CAG GAG AAA CTA OGG OGG ATC AGA GAG GOC TOC OOC 1797 
Lys Glu Pro Trp Gln Glu Lys Val Arg Arg Ile Arg Glu Gly Ser Pro 
550 555 560 

TAC GGC CAT CTC OOC AAT TGG OGG OTO CTG TOA GTC ATT GTC AAG TCT 1845 
Tyr Gly His Leu Pro Asn Trp Arg Leu Leu Ser Val Ile Val Lys 
565 570 575 

GGG GAT GAC CTT OGG CAA GAG CTT CTG GGC TTT CAG GTG TTG AAG CAA 1893 
Gly Asp Asp Leu Arg Gln Glu Leu Leu Ala Phe Gln Val Leu Lys Gln 
580 585 590 

CTG CAG TOC ATT TGG GAA CAG GAG OGA GTG OOC CTT TGG ATC AAG OCA 1941 
Leu Gln Ser Ile Trp Glu Gln Glu Arg Val Pro Leu Trp Ile Lys Pro 
595 600 605 

ATA CAA GAT TCT TGT GAA ATT ACG ACT GAT ACT GGC ATG ATT GAA CCA 1989 
Ile Gln Asp Ser Cys Glu Ile Thr Thr Asp Ser Gly Met Ile Glu Pro 
610 615 620 625 

GTG GTC AAT OCT GTG TCX: ATC CAT GAG GTG AAG AAA CAG TCA GAG CTC 2037 
Val Val Asn Ala Val Ser Ile His Gln Val Lys Lys Gln Ser Gln Leu 
630 635 640 

TOC TTG CTC GAT TAG TTC CTA CAG GAG CAC GGC AGT TAG ACC ACT GAG 2085 
Ser Leu Leu Asp Tyr Phe Leu Gln Glu His Gly Ser Tyr Thr Thr Glu 
645 650 655 

GCA TTC CTC AGT GCA CAG GGC AAT TTT GTG CAA AGT TGT OCT GGG TAC 2133 
Ala Phe Leu Ser Ala Gln Arg Asn Phe Val Gln Ser Cys Ala Gly Tyr 
660 665 670 

TGC TTG GTC TGC TAC CTG CTG CAA GTC AAG GAC AGA CAC AAT GGG AAT 2181 
Cys Leu Val Cys Tyr Leu Leu Gln Val Lys Asp Arg His Asn Gly Asn 
675 680 685 

ATC CTT TTG GAC GCA GAA GGC CAC ATC ATC CAC ATC GAC TTT GGC TTC 2229 
Ile Leu Leu Asp Ala Glu Gly His Ile Ile His Ile Asp Phe Gly Phe 
690 695 700 705 

ATC CTC TOC AGC TCA OOC OGA AAT CTG GGC TTT GAG AOG TCA GOC TTT 2277 
Ile Leu Ser Ser Ser Pro Arg Asn Leu Gly Phe Glu Thr Ser Ala Phe 
710 715 720 

AAG CTG ACC ACA GAG TTT GTG GAT GTG ATG GGC GGC CTG GAT GGC GAC 2325 
Lys Leu Thr Thr Glu Phe Val Asp Val Met Gly Gly Leu Asp Gly Asp 
725 730 735 

ATG TTC AAC TAC TAT AAG ATG CTG ATG CTG CAA GGG CTG ATT GOC GCT 2373 
Met Phe Asn Tyr Tyr Lys Met Leu Met Leu Gln Gly Leu Ile Ala Ala 
740 745 750 

CGG AAA CAC ATG GAC AAG GTG GTG CAG ATC GTG GAG ATC ATG CAG CAA 2421 
Arg Lys His Met Asp Lys Val Val Gln Ile Val Glu Ile Met Gln Gln 
755 760 765 

GGT TCT CAG CTT OCT TGC TTC CAT GGC TOC AGC ACC ATT OGA AAC CTC 2469 
Gly Ser Gln Leu Pro Cys Phe His Gly Ser Ser Thr Ile Arg Asn Leu 
770 775 780 785 

AAA GAG AGG TTC CAC ATG AGC ATG ACT GAG GAG CAG CTG CAG CTG CTG 2517 
Lys Glu Arg Phe His Met Ser Met Thr Glu Glu Gln Leu Gln Leu Leu 
790 795 800 

GTG GAG CAG ATG GTG GAT GGC AGT ATG OGG TCT ATC AGO ACC AAA CTC 2565 
Val Glu Gln Met Val Asp Gly Ser Met Arg Ser Ile Thr Thr Lys Leu 
805 810 815 

TAT GAC GGC TTC CAG TAG CTC ACC AAC GGC ATC ATG TGA CAOGCTOCTC 2614 
Tyr Asp Gly Phe Gln Tyr Leu Thr Asn Gly Ile Met * * 
820 825 830 

AOGOCAGGAG T0GT0GOC30G TCCAOOGCAC CCTOOCTAGA GGGOOCTTGT CTGAGAAACC 2674 

CX^AAOCAGG AAAOCXXACC TACX3CAAIC3CA TCX^CXTAAG GGAAATGGAA GGCAAGAAAC 2734 

AOGAAOGATC ATGTGGTAAC TOOGAGAOCT TGCTGAGGGG TGGGAGAGOC AGCTGTGGOG 2794 

TOCAGACTTG TTGGQGCTTC OCTGOCOCTC CTGGTCTGTG TCAGTATTAC CAOCAGACFG 2854 

ACTOCAOGAC TCACT00C3CT OCAGAAAACA GAGGTGACAA ATGTGAOGGA CACTOOGOOC 2914 

TTTCTTCTOC TTGTAGGGGT CTCTCAGAGG TTCTTTOCAC AGGCXS^TCCT CTTATTOOGT 2974 

TCTOGGGPCX: AGGAAGTOGG GAA6AGTA0G TTCTCX3GTAC TTAOGACTTG ATCCTGTGGT 3034 

TGOCACTGGC C3VTGCTGCTG OOCAGCTCTA COOCTOOCAG OGACJCTAOX: CTCOCAOGGA 3094 

<30GACXXX?rG GCXTAAGCTC OOCTTGCTQG CX3GGGGCTGC GTGGGOOCTG CACTTGCTGA 3154 

GGTTGOOCAT CATGQGCAAG GCAAGGGAAT TCCX^CAGOC CTOCAGTGTA CTGAGGGTAC 3214 

TOGGCTAOOC ATGTGGAATT GCXJIACCCTG ACTOCTTCOC CAAAOOCAGG GAAAAGAGCT 3274 

CTCAATTTTT TATTTTTAAT TTTTGTTTGA AATAAAGTOC TTAGTTAQCX: 3324 

(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 810 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 

Met Pro Met Asp Leu Ile Leu Val Val Trp Phe Cys Val Cys Thr Ala 
1 5 10 15 

Arg Thr Val Val Gly Phe Gly Met Asp Pro Asp Leu Gln Met Asp Ile 
20 25 30 

Val Thr Glu Leu Asp Leu Val Asn Thr Thr Leu Gly Val Ala Gln Val 
35 40 45 

Ser Gly Met His Asn Ala Ser Lys Ala Phe Leu Phe Gln Asp Ile Glu 
50 55 60 

Arg Glu Ile His Ala Ala Pro His Val Ser Glu Lys Leu Ile Gln Leu 
65 70 75 80 

Phe Gln Asn Lys Ser Glu Phe Thr Ile Leu Ala Thr Val Gln Gln Lys 
85 90 95 

Pro Ser Thr Ser Gly Val Ile Leu Ser Ile Arg Glu Leu Glu His Ser 
100 105 110 

Tyr Phe Glu Leu Glu Ser Ser Gly Leu Arg Asp Glu Ile Arg Tyr His 
115 120 125 

Tyr Ile His Asn Gly Lys Pro Arg Thr Glu Ala Leu Pro Tyr Arg Met 
130 135 140 

Ala Asp Gly Gln Trp His Lys Val Ala Leu Ser Val Ser Ala Ser His 
145 150 155 160 

Leu Leu Leu His Val Asp Cys Asn Arg Ile Tyr Glu Arg Val Ile Asp 
165 170 175 

Pro Pro Asp Thr Asn Leu Pro Pro Gly Ile Asn Leu Trp Leu Gly Gln 
180 185 190 

Arg Asn Gln Lys His Gly Leu Phe Lys Gly Ile Ile Gln Asp Gly Lys 
195 200 205 

Ile Ile Phe Met Pro Asn Gly Tyr Ile Thr Gln Cys Pro Asn Leu Asn 
210 215 220 

His Thr Cys Pro Thr Cys Ser Asp Phe Leu Ser Leu Val Gln Gly Ile 
225 230 235 240 

Met Asp Leu Gln Glu Leu Leu Ala Lys Met Thr Ala Lys Leu Asn Tyr 
245 250 255 

Ala Glu Thr Arg Leu Ser Gln Leu Glu Asn Cys His Cys Glu Lys Thr 
260 265 270 

Cys Gln Val Ser Gly Leu Leu Tyr Arg Asp Gln Asp Ser Trp Val Asp 
275 280 285 

Gly Asp His Cys Arg Asn Cys Thr Cys Lys Ser Gly Ala Val Glu Cys 
290 295 300 

Arg Arg Met Ser Cys Pro Pro Leu Asn Cys Ser Pro Asp Ser Leu Pro 
305 310 315 320 

Val His Ile Ala Gly Gln Cys Cys Lys Val Cys Arg Pro Lys Cys Ile 
325 330 335 

Tyr Gly Gly Lys Val Leu Ala Glu Gly Gln Arg Ile Leu Thr Lys Ser 
340 345 350 

Cys Arg Glu Cys Arg Gly Gly Val Leu Val Lys Ile Thr Glu Met Cys 
355 360 365 

Pro Pro Leu Asn Cys Ser Glu Lys Asp His Ile Leu Pro Glu Asn Gln 
370 375 380 

Cys Cys Arg Val Cys Arg Gly His Asn Phe Cys Ala Glu Gly Pro Lys 
385 390 395 400 

Cys Gly Glu Asn Ser Glu Cys Lys Asn Trp Asn Thr Lys Ala Thr Cys 
405 410 415 

Glu Cys Lys Ser Gly Tyr Ile Ser Val Gln Gly Asp Ser Ala Tyr Cys 
420 425 430 

Glu Asp Ile Asp Glu Cys Ala Ala Lys Met His Tyr Cys His Ala Asn 
435 440 445 

Thr Val Cys Val Asn Leu Pro Gly Leu Tyr Arg Cys Asp Cys Val Pro 
450 455 460 

Gly Tyr Ile Arg Val Asp Asp Phe Ser Cys Thr Glu His Asp Glu Cys 
465 470 475 480 

Gly Ser Gly Gln His Asn Cys Asp Glu Asn Ala Ile Cys Thr Asn Thr 
485 490 495 

Val Gln Gly His Ser Cys Thr Cys Lys Pro Gly Tyr Val Gly Asn Gly 
500 505 510 

Thr Ile Cys Arg Ala Phe Cys Glu Glu Gly Cys Arg Tyr Gly Gly Thr 
515 520 525 

Cys Val Ala Pro Asn Lys Cys Val Cys Pro Ser Gly Phe Thr Gly Ser 
530 535 540 

His Cys Glu Lys Asp Ile Asp Glu Cys Ser Glu Gly Ile Ile Glu Cys 
545 550 555 560 

His Asn His Ser Arg Cys Val Asn Leu Pro Gly Trp Tyr His Cys Glu 
565 570 575 

Cys Arg Ser Gly Phe His Asp Asp Gly Thr Tyr Ser Leu Ser Gly Glu 
580 585 590 

Ser Cys Ile Asp Ile Asp Glu Cys Ala Leu Arg Thr His Thr Cys Trp 
595 600 605 

Asn Asp Ser Ala Cys Ile Asn Leu Ala Gly Gly Phe Asp Cys Leu Cys 
610 615 620 

Pro Ser Gly Pro Ser Cys Ser Gly Asp Cys Pro His Glu Gly Gly Leu 
625 630 635 640 

Lys His Asn Gly Gln Val Trp Thr Leu Lys Glu Asp Arg Cys Ser Val 
645 650 655 

Cys Ser Cys Lys Asp Gly Lys Ile Phe Cys Arg Arg Thr Ala Cys Asp 
660 665 670 

Cys Gln Asn Pro Ser Ala Asp Leu Phe Cys Cys Pro Glu Cys Asp Thr 
675 680 685 

Arg Val Thr Ser Gln Cys Leu Asp Gln Asn Gly His Lys Leu Tyr Arg 
690 695 700 

Ser Gly Asp Asn Trp Thr His Ser Cys Gln Gln Cys Arg Cys Leu Glu 
705 710 715 720 

Gly Glu Val Asp Cys Trp Pro Leu Thr Cys Pro Asn Leu Ser Cys Glu 
725 730 735 

Tyr Thr Ala Ile Leu Glu Gly Glu Cys Cys Pro Arg Cys Val Ser Asp 
740 745 750 

Pro Cys Leu Ala Asp Asn Ile Thr Tyr Asp Ile Arg Lys Thr Cys Leu 
755 760 765 

Asp Ser Tyr Gly Val Ser Arg Leu Ser Gly Ser Val Trp Thr Met Ala 
770 775 780 

Gly Ser Pro Cys Thr Thr Cys Lys Cys Lys Asn Gly Arg Val Cys Cys 
785 790 795 800 

Ser Val Asp Phe Glu Cys Leu Gln Asn Asn 
805 810 



(2) INFORMATION FOR SEQ ID NO: 35: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2430 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 

ATQOOGATGG ATTTGATTTT AGTTGTGTGG TTCTGTGTGT GCACTQCCAG GACAGTGGTiS 60 

GGCTTTGGGA TCGAOCX^TGA CCTTCAGATG GATATGGTCA O^GAGCTTGA OCTTGTGAAC 120 

ADCAOOCTTG GAGTTGCTCA GGTGTCTQGA ATGCACAATG OCAGCAAAGC ATTTTTATTT 180 

CAASACATA6 AA^GAGAGAT OCATGCAGCT CCTCATGTGA GrTGAGAAATT AATTCA0CT6 240 

TTCCAGAACA AGAGTGAATT CAOCATTTTG GOCACTGTAC AOCAGAAGOC ATOCACTTCA 300 

GGAGTGATT^ TGTCCATTOG AGAACTGGAG CACAGCTATT TTGAACTGGA GAGCAGTGGC 360 

CTGAGOGATG AGATTOGGTA TCACTACATA CACAATGGGA AGCCAAGGAC AGAGGCACTT 420 

OCTTAOOGCA TGGCAGATGG ACAATGGCAC AAGGTTGCAC TGTCAGTTAG OGOCTCTCAT 480 

CTOCTGCTCC ATGT0GACT6 TAACAG6ATT TATGAGOGTG TGATAGAOOC TOCAGATAOC 540 

AAOCTTCOOC CAGGAATCAA TTTATGGCTT GGCCAGCX5CA AOCAAAAGCA TGGCTTATTC 600 

AAAGGGATCA TCCAAGATGG GAAGATCATC TTTATGOOGA ATGGATATAT AACACAGTGT 660 

OCAAATCTAA ATCACACTTG COCAAOCTGC AGTGATTTCT TAAGOCTGGT GCAAGGAATA 720 

ATOGATTTAC AA6AGCTTTT GOOCAAGATG ACTGCAAAAC TAAATTATGC AGAGACAAGA 780 

CTTAGTCAAT TGGAAAACTG TCATTGTGAG AAGACTTGTC AAGTGAGTGG ACTGCTCTAT 840 

OGAGATCAAG ACTCTTGGGT T^TGGTGAC CATTGCAG6A ACTGCACTTG CAAAAGTGGT 900 

GOOGTGGAAT GOOGAAGGAT GTa^KSTOOC OCTCTCAATT GCTOOOCAGA CTOOCTGOCA 960 

GTACACATTG CTGGOCAGTG CTGTAAGGTC TGOOGAOCAA AATGTATCTA TOGAGGAAAA 1020 

GTTCTTGCAG AAGGCXIAGCG GATTTTAACC 7VAGAGCTGTC GGGAATGOOG AGGTGGAGTT 1080 

TTAGTAAAAA TTACAGAAAT GTGTOCTOCT TTGAACTGCT CAGAAAAGGA TCACATTCTT 1140 

OCTGAGAATC AGTGCTGOGG TGTCTGTAGA GGTCATAACT TTTGTGCAGA AGGAOCTAAA 1200 

TGTGGTGAAA ACTCAGAGTG CAAAAACTGG AATACAAAAG CTACTTGTGA GTGCAAGAGT 1260 

GGTTACATCT CTGTCCAGGG AGACTCTGCC TACTGTGAAG ATATTGATGA GTGTGCAGCT 1320 

AAGATGCATT ACTGTCATGC CAATACTGTG TGTGTCAAC3C TTOCTGGGTT ATATQGCTGT 1380 

GACTGTGTCC CAOGATACAT TOGTGTQGAT GACTTCTCTT GTACAGAACA CX3ATGAATGT 1440 

OOCAOOGOOC AOCACAACIG TGATGAGAAT OOCATCTOCA CX:AACACrGT CXiAOOGACAC 1500 

AGCTGCAOCT GCAAAOOGGG CTAOGTGGGG AAOGGGAOCA TCTGCAGAGC TTTCTGTGAA 1560 

GAOGGCIOCA GATAOSGTOG AJkOGTGTCTG GCTOOCAACA AATGTGTCIG TOCATCTGGA 1620 

TTCACAOGAA OCX:ACTOOGA GAAAGATATT GATGAATGTT CAGAOOGAAT CATTCSVGTGC 1680 

CACAAOCATT CCCGCTGCGT TAAOCTGOCA OGGTGGTAOC ACIGTGAGTG CA6AA00GGT 1740 

TTOCATGAOG ATGGGAOCTA TTCACTGTOC GGGGAGTOCT GTATTGACAT TGATGAATGT 1800 

GCCTTAAGAA CTCACAOCIG TTGGAAC6AT TCTOOCTOCA TCAAOCTGGC AGGOOGTTTT 1860 

GACTGTCTCT GCCCCTCTGG GCCCTCCTGC TCTGGTGACT GTCCTCATGA AGGGGGGCTG 1920 

AA0CACAAT6 GOCAGGTGTG GAOCTTGAAA GAAGACAOGT GTTCTGTCI6 CTOCIOCAAG 1980 

GATGGCAAGA TATTCTGOOG AOGGACAGCT TGTGATTQOC AGAATOCAAG TGCTGAOCTA 2040 

TTCTGTTOOC CAGAATGTGA CIACDCAGAGTC ACAAGTCAAT GTTTAGACXSV AAATGGTCAC 2100 

AAGCTGTATC GAACTGGAGA CAATTOGAOC CATAOCTOIC AGCAGTGTCX? GTGTCTGGAA 2160 

OGAGAOGTA6 ATTGCTGGOC ACTCACTTOC CXXIAACTTGA GCIGrGAGTA TACAOCTATC 2220 

TTAGAAOOGG AATGTTGTOC OOGCTGTCTC AGTGAOOOCT GOCTAGCTGA TAACATCACX: 2280 

TATGACATCA GAAAAACTTG CXTTGGACAGC TATGGTGITT CAO0GCTTA6 TGGCTCAG?rG 2340 

TOQAOGATOG CTOGATCTOC CIGCACAACX: TGTAAATOCA AGAATGGAAG AGTCTGITGr 2400 

TCIGTGGATT TTGAGTGTCT TCAAAATAAT 2430 

(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2977 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: Human fetal brain cDNA library 

(B) CLONE: GEN-073E07 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 103..2532 
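
As a consistency check on the feature table above, the CDS location can be compared with the lengths stated for SEQ ID NO: 34 (810 amino acids) and SEQ ID NO: 35 (2430 base pairs). The following Python sketch assumes that the location uses 1-based, inclusive coordinates; the helper names `cds_length` and `codon_count` are illustrative only and are not part of the listing.

```python
def cds_length(start: int, end: int) -> int:
    """Number of bases in a CDS given 1-based, inclusive coordinates (assumed convention)."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


def codon_count(start: int, end: int) -> int:
    """Number of whole codons in the CDS; rejects spans that are not a multiple of three."""
    n_bases = cds_length(start, end)
    if n_bases % 3 != 0:
        raise ValueError("CDS span is not a whole number of codons")
    return n_bases // 3


if __name__ == "__main__":
    # Location taken from the feature table of SEQ ID NO: 36.
    print(cds_length(103, 2532), codon_count(103, 2532))
```

Under that assumption the location 103..2532 spans 2430 bases, that is 810 codons, which agrees with both stated lengths.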

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 

TAGCAAGTTT GGOQGCTCCA AGOCAGGOGC GOCTCAGGAT CCAGGCTCAT TTGCTTOCAC 60 
GTAGCTTOGG TQOOOOCTGC TAGGOGGGGA OOCTOGAGAG CG ATG CCG ATG GAT 114 
Met Pro Met Asp 
1 

TTG ATT TTA GTT GTG TGG TTC TGT GTG TGC ACT GCC AGG ACA GTG GTG 162 
Leu Ile Leu Val Val Trp Phe Cys Val Cys Thr Ala Arg Thr Val Val 
5 10 15 20 

GGC TTT GGG ATG GAC CCT GAC CTT CAG ATG GAT ATC GTC ACC GAG CTT 210 
Gly Phe Gly Met Asp Pro Asp Leu Gln Met Asp Ile Val Thr Glu Leu 
25 30 35 

GAC CTT GTG AAC ACC ACC CTT GGA GTT GCT CAG GTG TCT GGA ATG CAC 258 
Asp Leu Val Asn Thr Thr Leu Gly Val Ala Gln Val Ser Gly Met His 
40 45 50 

AAT GCC AGC AAA GCA TTT TTA TTT CAA GAC ATA GAA AGA GAG ATC CAT 306 
Asn Ala Ser Lys Ala Phe Leu Phe Gln Asp Ile Glu Arg Glu Ile His 
55 60 65 

GCA GCT CCT CAT GTG AGT GAG AAA TTA ATT CAG CTG TTC CAG AAC AAG 354 
Ala Ala Pro His Val Ser Glu Lys Leu Ile Gln Leu Phe Gln Asn Lys 
70 75 80 

AGT GAA TTC ACC ATT TTG GCC ACT GTA CAG CAG AAG CCA TCC ACT TCA 402 
Ser Glu Phe Thr Ile Leu Ala Thr Val Gln Gln Lys Pro Ser Thr Ser 
85 90 95 100 

GGA GTG ATA CTG TCC ATT CGA GAA CTG GAG CAC AGC TAT TTT GAA CTG 450 
Gly Val Ile Leu Ser Ile Arg Glu Leu Glu His Ser Tyr Phe Glu Leu 
105 110 115 

GAG AGC AGT GGC CTG AGG GAT GAG ATT CGG TAT CAC TAC ATA CAC AAT 498 
Glu Ser Ser Gly Leu Arg Asp Glu Ile Arg Tyr His Tyr Ile His Asn 
120 125 130 

GGG AAG CCA AGG ACA GAG GCA CTT CCT TAC CGC ATG GCA GAT GGA CAA 546 
Gly Lys Pro Arg Thr Glu Ala Leu Pro Tyr Arg Met Ala Asp Gly Gln 
135 140 145 

TGG CAC AAG GTT GCA CTG TCA GTT AGC GOC TCT CAT CTC CTG CTC CAT 594 
Trp His Lys Val Ala Leu Ser Val Ser Ala Ser His Leu Leu Leu His 
150 155 160 

GTC GAC TGT AAC AGG ATT TAT GAG CGT GT6 ATA GAC OCT OCA GAT AOC 642 
Val Asp Cys Asn Arg Ile Tyr Glu Arg Val Ile Asp Pro Pro Asp Thr 
165 170 175 180 

AAC CTT OOC CCA GGA ATC AAT TTA TGG CTT GOC CAG CGC AAC CAA AAG 690 
Asn Leu Pro Pro Gly Ile Asn Leu Trp Leu Gly Gln Arg Asn Gln Lys 
185 190 195 

CAT GGC TTA TTC AAA GGG ATC ATC CAA GAT GOG AAG ATC ATC TTT ATG 738 
His Gly Leu Phe Lys Gly Ile Ile Gln Asp Gly Lys Ile Ile Phe Met 
200 205 210 

CCG AAT GGA TAT ATA ACA CAG TGT OCA AAT CTA AAT CAC ACT TGC OCA 786 
Pro Asn Gly Tyr Ile Thr Gln Cys Pro Asn Leu Asn His Thr Cys Pro 
215 220 225 

AOC TGC AGT GAT TTC TTA AOC CTG GT6 CAA GGA ATA ATG GAT TTA CAA 834 
Thr Cys Ser Asp Phe Leu Ser Leu Val Gln Gly Ile Met Asp Leu Gln 
230 235 240 

GAG CTT TTG GCC AAG ATG ACT GCA AAA CTA AAT TAT GCA GAG ACA AGA 882 
Glu Leu Leu Ala Lys Met Thr Ala Lys Leu Asn Tyr Ala Glu Thr Arg 
245 250 255 260 

CTT AGT CAA TTG GAA AAC TGT CAT TGT GAG AAG ACT TGT CAA GTG AGT 930 
Leu Ser Gln Leu Glu Asn Cys His Cys Glu Lys Thr Cys Gln Val Ser 
265 270 275 

GGA CTG CTC TAT OGA GAT CAA GAC TCT TGG GTA GAT GOT GAC CAT TGC 978 
Gly Leu Leu Tyr Arg Asp Gln Asp Ser Trp Val Asp Gly Asp His Cys 
280 285 290 

AGG AAC TGC ACT TGC AAA AGT GOT GOC GTG GAA TGC OGA AGG ATG TOC 1026 
Arg Asn Cys Thr Cys Lys Ser Gly Ala Val Glu Cys Arg Arg Met Ser 
295 300 305 

TGT OOC OCT CTC AAT TGC TOC OCA GAC TGC CTC OCA GTA CAC ATT OCT 1074 
Cys Pro Pro Leu Asn Cys Ser Pro Asp Ser Leu Pro Val His Ile Ala 
310 315 320 

GGC CAG TGC TGT AAG GTC TGC CGA CCA AAA TGT ATC TAT GGA GGA AAA 1122 
Gly Gln Cys Cys Lys Val Cys Arg Pro Lys Cys Ile Tyr Gly Gly Lys 
325 330 335 340 

GTT CTT GCA GAA GGC CAG OGG ATT TTA AOC AAG AGC TGT CGG GAA TGC 1170 
Val Leu Ala Glu Gly Gln Arg Ile Leu Thr Lys Ser Cys Arg Glu Cys 
345 350 355 

GGA GGT GGA GTT TTA GTA AAA ATT ACA GAA ATG TGT OCT OCT TTG AAC 1218 
Arg Gly Gly Val Leu Val Lys Ile Thr Glu Met Cys Pro Pro Leu Asn 
360 365 370 

TGC TCA GAA AAG GAT CAC ATT CTT OCT GAG AAT CAG TGC TGC OGT GTC 1266 
Cys Ser Glu Lys Asp His Ile Leu Pro Glu Asn Gln Cys Cys Arg Val 
375 380 385 

TGT AGA GGT CAT AAC TTT TGT GCA GAA GGA OCT AAA TGT GGT GAA AAC 1314 
Cys Arg Gly His Asn Phe Cys Ala Glu Gly Pro Lys Cys Gly Glu Asn 
390 395 400 

TCA GAG TGC AAA AAC TGG AAT ACA AAA GOT ACT TGT GAG TGC AAG AGT 1362 
Ser Glu Cys Lys Asn Trp Asn Thr Lys Ala Thr Cys Glu Cys Lys Ser 
405 410 415 420 

GGT TAC ATC TCT GTC CAG GGA GAC TCT GCC TAG TGT GAA CAT ATT GAT 1410 
Gly Tyr Ile Ser Val Gln Gly Asp Ser Ala Tyr Cys Glu Asp Ile Asp 
425 430 435 

GAG TGT GCA GCT AAG ATG CAT TAC TGT CAT GCC AAT ACT GTG TGT GTC 1458 
Glu Cys Ala Ala Lys Met His Tyr Cys His Ala Asn Thr Val Cys Val 
440 445 450 

AAC CTT OCT GGG TTA TAT OGC TGT GAC TGT GTC OCA GGA TAC ATT OGT 1506 
Asn Leu Pro Gly Leu Tyr Arg Cys Asp Cys Val Pro Gly Tyr Ile Arg 
455 460 465 

GTG GAT GAC TTC TCT TGT ACA GAA CAC GAT GAA TGT GGC AGC GGC CAG 1554 
Val Asp Asp Phe Ser Cys Thr Glu His Asp Glu Cys Gly Ser Gly Gln 
470 475 480 

CAC AAC TGT GAT GAG AAT GCC ATC TGC AOC AAC ACT GTC CAG GGA CAC 1602 
His Asn Cys Asp Glu Asn Ala Ile Cys Thr Asn Thr Val Gln Gly His 
485 490 495 500 

AGC TGC AOC TGC AAA COG GGC TAC GTG GGG AAC GGG AOC ATC TGC AGA 1650 
Ser Cys Thr Cys Lys Pro Gly Tyr Val Gly Asn Gly Thr Ile Cys Arg 
505 510 515 

GCT TTC TGT GAA GAG GGC TGC AGA TAC GGT GGA AOG TGT GTG GCT OOC 1698 
Ala Phe Cys Glu Glu Gly Cys Arg Tyr Gly Gly Thr Cys Val Ala Pro 
520 525 530 

AAC AAA TGT GTC TGT CCA TCT GGA TTC ACA GGA AGC CAC TGC GAG AAA 1746 
Asn Lys Cys Val Cys Pro Ser Gly Phe Thr Gly Ser His Cys Glu Lys 
535 540 545 

GAT ATT GAT GAA TGT TCA GAG GGA ATC ATT GAG TGC CAC AAC CAT TOC 1794 
Asp Ile Asp Glu Cys Ser Glu Gly Ile Ile Glu Cys His Asn His Ser 
550 555 560 

CGC TGC GTT AAC CTG CCA GGG TGG TAC CAC TGT GAG TGC AGA AGC GGT 1842 
Arg Cys Val Asn Leu Pro Gly Trp Tyr His Cys Glu Cys Arg Ser Gly 
565 570 575 580 

TTC CAT GAC GAT GGG AOC TAT TCA GIG TOC GGG GAG TOC TGT ATT GAC 1890 
Phe His Asp Asp Gly Thr Tyr Ser Leu Ser Gly Glu Ser Cys Ile Asp 
585 590 595 

ATT GAT GAA TGT GOC TTA AGA ACT CAC AOC TGT TGG AAC GAT TCT GOC 1938 
Ile Asp Glu Cys Ala Leu Arg Thr His Thr Cys Trp Asn Asp Ser Ala 
600 605 610 

TGC ATC AAC CTG GCA GGG GGT TTT GAC TGT CTC TOO OOC TCT GGG COC 1986 
Cys Ile Asn Leu Ala Gly Gly Phe Asp Cys Leu Cys Pro Ser Gly Pro 
615 620 625 

TOC TGC TCT GGT GAC TGT OCT CAT GAA GGG GGG CTG AAG CAC AAT GGC 2034 
Ser Cys Ser Gly Asp Cys Pro His Glu Gly Gly Leu Lys His Asn Gly 
630 635 640 

CAG GTG TGG ACC TTG AAA GAA GAC AGG TGT TCT GTC TGC TCC TGC AAG 2082 
Gln Val Trp Thr Leu Lys Glu Asp Arg Cys Ser Val Cys Ser Cys Lys 
645 650 655 660 

GAT GGC AAG ATA TTC TGC OGA OGG ACA GCT TGT GAT TGC CAG AAT OCA 2130 
Asp Gly Lys Ile Phe Cys Arg Arg Thr Ala Cys Asp Cys Gln Asn Pro 
665 670 675 

AGT GOT GAC OTA TTC TGT TGC OCA GAA TGT GAC AOC AGA GTC ACA AGT 2178 
Ser Ala Asp Leu Phe Cys Cys Pro Glu Cys Asp Thr Arg Val Thr Ser 
680 685 690 

CAA TGT TTA GAC CAA AAT GGT CAC AAG CTG TAT CGA AGT GGA GAC AAT 2226 
Gln Cys Leu Asp Gln Asn Gly His Lys Leu Tyr Arg Ser Gly Asp Asn 
695 700 705 

TGG AOC CAT AGC TGT CAG CAG TGT OGG TGT CTG GAA GGA GAG GTA GAT 2274 
Trp Thr His Ser Cys Gln Gln Cys Arg Cys Leu Glu Gly Glu Val Asp 
710 715 720 



TGC TGG OCA CTC ACT TGC OCC AAC TTG AGC TGT GAG TAT ACA OCT ATC 2322 
Cys Trp Pro Leu Thr Cys Pro Asn Leu Ser Cys Glu Tyr Thr Ala Ile 
725 730 735 740 

TTA GAA GGG GAA TGT TGT CCC CGC TGT GTC AGT GAC CCC TGC CTA GCT 2370 
Leu Glu Gly Glu Cys Cys Pro Arg Cys Val Ser Asp Pro Cys Leu Ala 
745 750 755 

GAT AAC ATC AOC TAT GAC ATC AGA AAA ACT TGC CTG GAC AGC TAT GGT 2418 
Asp Asn Ile Thr Tyr Asp Ile Arg Lys Thr Cys Leu Asp Ser Tyr Gly 
760 765 770 

GTT TCA OGG CTT AGT GGC TCA GTG TGG AOG ATG GCT GGA TCT OCC TGC 2466 
Val Ser Arg Leu Ser Gly Ser Val Trp Thr Met Ala Gly Ser Pro Cys 
775 780 785 

ACA AOC TGT AAA TGC AAG AAT GGA AGA GTC TGT TGT TCT GTG GAT TTT 2514 
Thr Thr Cys Lys Cys Lys Asn Gly Arg Val Cys Cys Ser Val Asp Phe 
790 795 800 

GAG TGT CTT CAA AAT AAT TGAAGTATTT ACAGTGGACT CAADQCAGAA 2562 
Glu Cys Leu Gln Asn Asn 
805 810 

GAATGGAOGA AATGAOCATC CAAOGTGATT AAGGATAGGA ATOQGTAGTT TGGTTTTTTT 2622 

GTTTGTTTTG TTTTTTTAAC CACAGATAAT TGOCAAAGTT TOCAOCTGAG GAOGGTGTTT 2682 

OGGAOGTTGC CTTTTGGAOC TAOCACTTTG CTCATTCTTG CTAAOCTAGT CTAGGTGADC 2742 

TACAGTGOOG TGCATTTAAG TCAATGGTTG TTAAAAGAAG TTTOOOGTGT TGTAAATCAT 2802 

GTTTOOCTTA TCAGATCATT TGCAAATACA TTTAAATGAT CTCATGGTAA ATGGTTGAT6 2862 

TATTTTTTGG GTTTATTTTG TGTACTAAOC AT7UVTAGAGA GAGACTCAGC TOCTTTTATT 2922 

TATTTTGTTG ATTTATGGAT CAAATTCTAA AATAAAGTTG OCTGTTGT6A CTTTT 2977 



(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 816 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 

Met Glu Ser Arg Val Leu Leu Arg Thr Phe Cys Leu Ile Phe Gly Leu 
1 5 10 15 

Gly Ala Val Trp Gly Leu Gly Val Asp Pro Ser Leu Gln Ile Asp Val 
20 25 30 

Leu Thr Glu Leu Glu Leu Gly Glu Ser Thr Thr Gly Val Arg Gln Val 
35 40 45 

Pro Gly Leu His Asn Gly Thr Lys Ala Phe Leu Phe Gln Asp Thr Pro 
50 55 60 

Arg Ser Ile Lys Ala Ser Thr Ala Thr Ala Glu Gln Phe Phe Gln Lys 
65 70 75 80 

Leu Arg Asn Lys His Glu Phe Thr Ile Leu Val Thr Leu Lys Gln Thr 
85 90 95 

His Leu Asn Ser Gly Val Ile Leu Ser Ile His His Leu Asp His Arg 
100 105 110 

Tyr Leu Glu Leu Glu Ser Ser Gly His Arg Asn Glu Val Arg Leu His 
115 120 125 

Tyr Arg Ser Gly Ser His Arg Pro His Thr Glu Val Phe Pro Tyr Ile 
130 135 140 

Leu Ala Asp Asp Lys Trp His Lys Leu Ser Leu Ala Ile Ser Ala Ser 
145 150 155 160 

His Leu Ile Leu His Ile Asp Cys Asn Lys Ile Tyr Glu Arg Val Val 
165 170 175 

Glu Lys Pro Ser Thr Asp Leu Pro Leu Gly Thr Thr Phe Trp Leu Gly 
180 185 190 

Gln Arg Asn Asn Ala His Gly Tyr Phe Lys Gly Ile Met Gln Asp Val 
195 200 205 

Gln Leu Leu Val Met Pro Gln Gly Phe Ile Ala Gln Cys Pro Asp Leu 
210 215 220 

Asn Arg Thr Cys Pro Thr Cys Asn Asp Phe His Gly Leu Val Gln Lys 
225 230 235 240 

Ile Met Glu Leu Gln Asp Ile Leu Ala Lys Thr Ser Ala Lys Leu Ser 
245 250 255 

Arg Ala Glu Gln Arg Met Asn Arg Leu Asp Gln Cys Tyr Cys Glu Arg 
260 265 270 

Thr Cys Thr Met Lys Gly Thr Thr Tyr Arg Glu Phe Glu Ser Trp Ile 
275 280 285 

Asp Gly Cys Lys Asn Cys Thr Cys Leu Asn Gly Thr Ile Gln Cys Glu 
290 295 300 

Thr Leu Ile Cys Pro Asn Pro Asp Cys Pro Leu Lys Ser Ala Leu Ala 
305 310 315 320 

Tyr Val Asp Gly Lys Cys Cys Lys Glu Cys Lys Ser Ile Cys Gln Phe 
325 330 335 

Gln Gly Arg Thr Tyr Phe Glu Gly Glu Arg Asn Thr Val Tyr Ser Ser 
340 345 350 

Ser Gly Val Cys Val Leu Tyr Glu Cys Lys Asp Gln Thr Met Lys Leu 
355 360 365 

Val Glu Ser Ser Gly Cys Pro Ala Leu Asp Cys Pro Glu Ser His Gln 
370 375 380 

Ile Thr Leu Ser His Ser Cys Cys Lys Val Cys Lys Gly Tyr Asp Phe 
385 390 395 400 

Cys Ser Glu Arg His Asn Cys Met Glu Asn Ser Ile Cys Arg Asn Leu 
405 410 415 

Asn Asp Arg Ala Val Cys Ser Cys Arg Asp Gly Phe Arg Ala Leu Arg 
420 425 430 

Glu Asp Asn Ala Tyr Cys Glu Asp Ile Asp Glu Cys Ala Glu Gly Arg 
435 440 445 

His Tyr Cys Arg Glu Asn Thr Met Cys Val Asn Thr Pro Gly Ser Phe 
450 455 460 

Met Cys Ile Cys Lys Thr Gly Tyr Ile Arg Ile Asp Asp Tyr Ser Cys 
465 470 475 480 

Thr Glu His Asp Glu Cys Ile Thr Asn Gln His Asn Cys Asp Glu Asn 
485 490 495 

Ala Leu Cys Phe Asn Thr Val Gly Gly His Asn Cys Val Cys Lys Pro 
500 505 510 

Gly Tyr Thr Gly Asn Gly Thr Thr Cys Lys Ala Phe Cys Lys Asp Gly 
515 520 525 

Cys Arg Asn Gly Gly Ala Cys Ile Ala Ala Asn Val Cys Ala Cys Pro 
530 535 540 

Gln Gly Phe Thr Gly Pro Ser Cys Glu Thr Asp Ile Asp Glu Cys Ser 
545 550 555 560 

Asp Gly Phe Val Gln Cys Asp Ser Arg Ala Asn Cys Ile Asn Leu Pro 
565 570 575 

Gly Trp Tyr His Cys Glu Cys Arg Asp Gly Tyr His Asp Asn Gly Met 
580 585 590 

Phe Ser Pro Ser Gly Glu Ser Cys Glu Asp Ile Asp Glu Cys Gly Thr 
595 600 605 

Gly Arg His Ser Cys Ala Asn Asp Thr Ile Cys Phe Asn Leu Asp Gly 
610 615 620 

Gly Tyr Asp Cys Arg Cys Pro His Gly Lys Asn Cys Thr Gly Asp Cys 
625 630 635 640 

Ile His Asp Gly Lys Val Lys His Asn Gly Gln Ile Trp Val Leu Glu 
645 650 655 

Asn Asp Arg Cys Ser Val Cys Ser Cys Gln Asn Gly Phe Val Met Cys 
660 665 670 

Arg Arg Met Val Cys Asp Cys Glu Asn Pro Thr Val Asp Leu Phe Cys 
675 680 685 

Cys Pro Glu Cys Asp Pro Arg Leu Ser Ser Gln Cys Leu His Gln Asn 
690 695 700 

Gly Glu Thr Leu Tyr Asn Ser Gly Asp Thr Trp Val Gln Asn Cys Gln 
705 710 715 720 

Gln Cys Arg Cys Leu Gln Gly Glu Val Asp Cys Trp Pro Leu Pro Cys 
725 730 735 

Pro Asp Val Glu Cys Glu Phe Ser Ile Leu Pro Glu Asn Glu Cys Cys 
740 745 750 

Pro Arg Cys Val Thr Asp Pro Cys Gln Ala Asp Thr Ile Arg Asn Asp 
755 760 765 

Ile Thr Lys Thr Cys Leu Asp Glu Met Asn Val Val Arg Phe Thr Gly 
770 775 780 

Ser Ser Trp Ile Lys His Gly Thr Glu Cys Thr Leu Cys Gln Cys Lys 
785 790 795 800 

Asn Gly His Ile Cys Cys Ser Val Asp Pro Gln Cys Leu Gln Glu Leu 
805 810 815 

(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2448 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 

ATGGAGTCTC GGGTCTTACT GAGAACATTC TGTTTGATCT TCGGTCTCGG AGCAGTTTGG 60 
GGGCTTGGTG TGGACCCTTC CCTACAGATT GACGTCTTAA CAGAGTTAGA ACTTGGGGAG 120 
TCCACGACCG GAGTGCGTCA GGTCCCGGGG CTGCATAATG GGACGAAAGC CTTTCTCTTT 180 
CAAGATACTC CCAGAAGCAT AAAAGCATCC ACTGCTACAG CTGAACAGTT TTTTCAGAAG 240 
CTGAGAAATA AACATGAATT TACTATTTTG GTGACCCTAA AACAGACCCA CTTAAATTCA 300 
GGAGTTATTC TCTCAATTCA CCACTTGGAT CACAGGTACC TGGAACTGGA AAGTAGTGGC 360 
CATCGGAATG AAGTCAGACT GCATTACCGC TCAGGCAGTC ACCGCCCTCA CACAGAAGTG 420 
TTTCCTTACA TTTTGGCTGA TGACAAGTGG CACAAGCTCT CCTTAGCCAT CAGTGCTTCC 480 
CATTTGATTT TACACATTGA CTGCAATAAA ATTTATGAAA GGGTAGTAGA AAAGCCCTCC 540 
ACAGACTTGC CTCTAGGCAC AACATTTTGG CTAGGACAGA GAAATAATGC GCATGGATAT 600 
TTTAAGGGTA TAATGCAAGA TGTCCAATTA CTTGTCATGC CCCAGGGATT TATTGCTCAG 660 
TGCCCAGATC TTAATCGCAC CTGTCCAACT TGCAATGACT TCCATGGACT TGTGCAGAAA 720 
ATCATGGAGC TACAGGATAT TTTAGCCAAA ACATCAGCCA AGCTGTCTCG AGCTGAACAG 780 
CGAATGAATA GATTGGATCA GTGCTATTGT GAAAGGACTT GCACCATGAA GGGAACCACC 840 
TACCGAGAAT TTGAGTCCTG GATAGACGGC TGTAAGAACT GCACATGCCT GAATGGAACC 900 
ATCCAGTGTG AAACTCTAAT CTGCCCAAAT CCTGACTGCC CACTTAAGTC GGCTCTTGCG 960 
TATGTGGATG GCAAATGCTG TAAGGAATGC AAATCGATAT GCCAATTTCA AGGACGAACC 1020 
TACTTTGAAG GAGAAAGAAA TACAGTCTAT TCCTCTTCTG GAGTATGTGT TCTCTATGAG 1080 

TOCAAOGAOC AGAOCATGAA ACTTGTTGAG AGTTCAGGCT GTOCAGCTTT GGATTOTOCA 1140 

GAGTCTCATC AGATAADCTT GTCTCACAQC TGTTGCAAAG TTTGTAAAGG TTATGACTTT 1200 

TGTTCTGAAA GGCATAACTG CATGGAGAAT TOCATCTGCA GAAATCTGAA TGACAGGGCT 1260 

GTTTOTAOCT GTOGAGAOXSG TTTTAOOOCT CTTOGAGAOG ATAATGOCTA CTGTGAAGAC 1320 

ATOGATGAGT GT0CIGAA06 OQGOCATTAC TGTOSTGAAA ATACAATGTG TGTCAACAOC 1380 

GOGOGTTCTT TTAIGTOCAT CTGCAAAACT OGATACATCA GAATTGATGA TTATTCATGrr 1440 

ACAGAACA1G ATGAGTGTAT CACAAATCAG CACAACTGTG ATGAAAATGC TTTATGCTTC 1500 

AACACTCTTG GAOGACACAA CTGTGTTTOC AAOOGOOOCT ATACAOOGAA TGGAAOGACA 1560 

TOCAAAGCAT TTTOCAAAGA T0GCTGTAO6 AATG6AGGA6 GCIGTATTOC OGCTAATGTG 1620 

TGTGCXJTGOC CACAAGGCTT CACTGGACOC AGGTGTGAAA C30C3ACATT6A TGAATGCTCT 1680 

GATGGTTTTG TTCAATGTGA CAGTOGTGCT AATTGCATTA AOCTGCXJrGG ATGGTAOCAC 1740 

TGTGAGTOCA GAGATGOCTA OCATGACAAT GGGATGTTTT CAOCAAGTG6 AGAATCX?IGT 1800 

GAAGATATTG ATGAGTGTOG GACX3GOGAG6 CACAGCTGI6 OCAATGATAC CATTTGCTTC 1860 
AATTT06ATG GCX3GATATGA TTGTOGATGT OCTCATGGAA AGAATTGCAC AGGOGACTOC 1920 
ATOCATGATG GAAAAGTTAA GCACAATGGT CAGATTTQGG TGTTGGAAT^JTGACAGGTGC 1980 
TCTGTGrrGCT CATGTCAGAA TOGATTOGIT ATGTGTOGAC GGATGGTCTG TGACTOTGAG 2040 
AAT0CXACA6 TTGATCTTTT TTGCTGOOCT GAATGTGAOC CAAGGCTTAG TAGTCAGTGC 2100 
CrOCATCAAA ATGGOGAAAC TTTGTATAAC AGTGGTGACA OC?rGGGTOCA GAATTGTCAA 2160 
CAGTGCXXSCr GCTTGGAAGG GGAAGITGAT TGTTGGOOOC TGOCTTGCXX: AGATGTGGAG 2220 
TGTGAATTCA GCATTCTOOC AGAGAATGAG TGCTGOOOGC GCTGTGTCAC AGACXXTTTGC 2280 
CAGGCTGACA OCATCXXX»A TGACATCAOC AAGACTTGOC TGGAOGAAAT GAATGTGGTT 2340 
OGCTTCAOOG GGTOCTCTTG GATCAAACAT GGCACTGAGT GTACTCTCTG CX3VGTGCAAG 2400 
AATGGOCACA TCTGTTGCTC AGTGGATOCA CAGTGCXnTC AGGAACTG 2448 
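
The reading frame of SEQ ID NO: 38 can be spot-checked by translating its opening bases with the standard genetic code and comparing the result with the first residues listed for SEQ ID NO: 37 (Met Glu Ser Arg Val Leu). The Python sketch below is illustrative only: the function `translate` and the table names are assumptions introduced here, and the input is the first 18 bases of the sequence as printed above (ATGGAGTCTC GGGTCTTA).

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_1 = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_1)}

# One-letter to three-letter residue names, as used in the listing.
ONE_TO_THREE = {
    "A": "Ala", "C": "Cys", "D": "Asp", "E": "Glu", "F": "Phe",
    "G": "Gly", "H": "His", "I": "Ile", "K": "Lys", "L": "Leu",
    "M": "Met", "N": "Asn", "P": "Pro", "Q": "Gln", "R": "Arg",
    "S": "Ser", "T": "Thr", "V": "Val", "W": "Trp", "Y": "Tyr",
    "*": "***",
}


def translate(dna: str) -> str:
    """Translate a DNA string (whitespace ignored) into three-letter residue names."""
    seq = "".join(dna.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    codons = (seq[i:i + 3] for i in range(0, len(seq), 3))
    return " ".join(ONE_TO_THREE[CODON_TABLE[c]] for c in codons)


if __name__ == "__main__":
    # First 18 bases of SEQ ID NO: 38 as printed in the listing.
    print(translate("ATGGAGTCTC GGGTCTTA"))
```

The same helper can be applied to any in-frame stretch of the listing whose bases are legible, which is one way to cross-check a nucleotide line against its translation line.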



(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3198 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: Human fetal brain cDNA library 

(B) CLONE: GEN-093E05 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 97..2544 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 



TTGGGAGGAG CAGTCTCTOC GCTOGTCTOC OGGAGCTTTC TOCATTGTCT CIXXX^TTTAC 60 

AACAGAOGGA GAOGATOGAC TGAGCTGATC OGCAOC ATG GAG TCT CGG GTC TTA 114 
Met Glu Ser Arg Val Leu 
1 5 



CTG AGA ACA TTC TGT TTG ATC TTC GGT CTC GGA GCA GTT TGG GGG CTT 162 
Leu Arg Thr Phe Cys Leu Ile Phe Gly Leu Gly Ala Val Trp Gly Leu 
10 15 20 

GGT GTG GAC CCT TCC CTA CAG ATT GAC GTC TTA ACA GAG TTA GAA CTT 210 
Gly Val Asp Pro Ser Leu Gln Ile Asp Val Leu Thr Glu Leu Glu Leu 
25 30 35 

GGG GAG TCC ACG ACC GGA GTG CGT CAG GTC CCG GGG CTG CAT AAT GGG 258 
Gly Glu Ser Thr Thr Gly Val Arg Gln Val Pro Gly Leu His Asn Gly 
40 45 50 

ACG AAA GCC TTT CTC TTT CAA GAT ACT CCC AGA AGC ATA AAA GCA TCC 306 
Thr Lys Ala Phe Leu Phe Gln Asp Thr Pro Arg Ser Ile Lys Ala Ser 
55 60 65 70 

ACT GCT ACA GCT GAA CAG TTT TTT CAG AAG CTG AGA AAT AAA CAT GAA 354 
Thr Ala Thr Ala Glu Gln Phe Phe Gln Lys Leu Arg Asn Lys His Glu 
75 80 85 

TTT ACT ATT TTG GTG ACC CTA AAA CAG ACC CAC TTA AAT TCA GGA GTT 402 
Phe Thr Ile Leu Val Thr Leu Lys Gln Thr His Leu Asn Ser Gly Val 
90 95 100 

ATT CTC TCA ATT CAC CAC TTG GAT CAC AGG TAG CTG GAA CTG GAA AGT 450 
lie Leu Ser lie His His Leu Asp His Arg Tyr Leu Glu Leu Glu Ser 
105 110 115 

AGT GGC CAT OGG AAT GAA GTC AGA CTG CAT TAG OGC TCA GGC 7\GT CAC 498 
Ser Gly His Arg Asn Glu Val Arg Leu His Tyr Arg Ser Gly Ser His 
120 125 130 

OGC OCT CAC ACA GAA GTG TTT OCT TAG ATT TTG GOT GAT GAC AAG TGG 546 
Arg Pro His Thr Glu Val Phe Pro Tyr He Leu Ala 7^ Asp Lys Trp 
135 140 145 150 

CAC AAG CTC TOC TTA GOC ATC AGT GCT TOC CAT TTG ATT TTA CAC ATT 594 
His Lys Leu Ser Leu Ala lie Ser Ala Ser His Leu He Leu His He 
155 160 165 

GAC TGC AAT AAA ATT TAT GAA AGG GTA GTA GAA AAG OOC TOC ACA GAC 642 
Asp Cys Asn Lys He Tyr Glu Arg Val Val Glu Lys Pro Ser Thr Asp 
170 175 180 

TTG OCT OTA GGC ACA ACA TTT TGG CTA GGA CAG AGA AAT AAT GOG CAT 690 
Leu Pro Leu Gly Thr Thr Phe Trp Leu Gly Gin Arg Asn Asn Ala His 
185 190 195 

GGA TAT TTT AAG GGT ATA ATG CAA QAT GTC CAA TTA CTT GTC ATG OOC 738 
Gly Tyr Phe Lys Gly He Met Gin Asp Val Gin Leu Leu Val Met: Pro 
200 205 210 

CAG GGA TTT ATT GCT CAG TGC OCA GAT CTT AAT OGC AOC TGT oCA ACT 786 
Gin Gly Phe He Ala Gin C^ Pro Asp Leu Asn Arg Thr Cys Pro Thr 
215 220 225 230 

TGC i\AT GAC TTC CAT GGA CTT GTG CAG AAA ATC ATG GAG CTA CAG GAT 834 
Cys Asn Asp Phe His Gly Leu Val Gin Lys He Met Glu Leu Gin Asp 
235 240 245 

ATT TTA GOC AAA ACA TCA GOC AAG CTG TCT OGA GCT GAA CAG OGA ATG 882 
He Leu Ala Lys Thr Ser Ala Lys Leu Ser Arg Ala Glu Gin Arg Met 
250 255 260 

AAT AGA TTG GAT CAG TGC TAT TGT GAA AGG ACT TGC AOC ATG AAG GGA 930 
Asn Arg Leu 7^ Gin Cys Tyr Cys Glu Arg Thr Cys Thr Met Lys Gly 
265 270 275 

ACC ACC TAC CGA GAA TTT GAG TCC TGG ATA GAC GGC TGT AAG AAC TGC 978 
Thr Thr Tyr Arg Glu Phe Glu Ser Trp Ile Asp Gly Cys Lys Asn Cys 
280 285 290 
ACA TGC CTG AAT GGA ACC ATC CAG TGT GAA ACT CTA ATC TGC CCA AAT 1026 
Thr Cys Leu Asn Gly Thr Ile Gln Cys Glu Thr Leu Ile Cys Pro Asn 
295 300 305 310 

CCT GAC TGC CCA CTT AAG TCG GCT CTT GCG TAT GTG GAT GGC AAA TGC 1074 
Pro Asp Cys Pro Leu Lys Ser Ala Leu Ala Tyr Val Asp Gly Lys Cys 
315 320 325 

TGT AAG GAA TGC AAA TCG ATA TGC CAA TTT CAA GGA CGA ACC TAC TTT 1122 
Cys Lys Glu Cys Lys Ser Ile Cys Gln Phe Gln Gly Arg Thr Tyr Phe 
330 335 340 

GAA GGA GAA AGA AAT ACA GTC TAT TCC TCT TCT GGA GTA TGT GTT CTC 1170 
Glu Gly Glu Arg Asn Thr Val Tyr Ser Ser Ser Gly Val Cys Val Leu 
345 350 355 

TAT GAG TGC AAG GAC CAG ACC ATG AAA CTT GTT GAG AGT TCA GGC TGT 1218 
Tyr Glu Cys Lys Asp Gln Thr Met Lys Leu Val Glu Ser Ser Gly Cys 
360 365 370 

CCA GCT TTG GAT TGT CCA GAG TCT CAT CAG ATA ACC TTG TCT CAC AGC 1266 
Pro Ala Leu Asp Cys Pro Glu Ser His Gln Ile Thr Leu Ser His Ser 
375 380 385 390 

TGT TGC AAA GTT TGT AAA GGT TAT GAC TTT TGT TCT GAA AGG CAT AAC 1314 
Cys Cys Lys Val Cys Lys Gly Tyr Asp Phe Cys Ser Glu Arg His Asn 
395 400 405 

TGC ATG GAG AAT TCC ATC TGC AGA AAT CTG AAT GAC AGG GCT GTT TGT 1362 
Cys Met Glu Asn Ser Ile Cys Arg Asn Leu Asn Asp Arg Ala Val Cys 
410 415 420 

AGC TGT CGA GAT GGT TTT AGG GCT CTT CGA GAG GAT AAT GCC TAC TGT 1410 
Ser Cys Arg Asp Gly Phe Arg Ala Leu Arg Glu Asp Asn Ala Tyr Cys 
425 430 435 

GAA GAC ATC GAT GAG TGT GCT GAA GGG CGC CAT TAC TGT CGT GAA AAT 1458 
Glu Asp Ile Asp Glu Cys Ala Glu Gly Arg His Tyr Cys Arg Glu Asn 
440 445 450 

ACA ATG TGT GTC AAC ACC CCG GGT TCT TTT ATG TGC ATC TGC AAA ACT 1506 
Thr Met Cys Val Asn Thr Pro Gly Ser Phe Met Cys Ile Cys Lys Thr 
455 460 465 470 

GGA TAC ATC AGA ATT GAT GAT TAT TCA TGT ACA GAA CAT GAT GAG TGT 1554 
Gly Tyr Ile Arg Ile Asp Asp Tyr Ser Cys Thr Glu His Asp Glu Cys 
475 480 485 

ATC ACA AAT CAG CAC AAC TGT GAT GAA AAT GCT TTA TGC TTC AAC ACT 1602 
Ile Thr Asn Gln His Asn Cys Asp Glu Asn Ala Leu Cys Phe Asn Thr 
490 495 500 

GTT GGA GGA CAC AAC TGT GTT TGC AAG CXX5 GGC TAT ACA GGG AAT GGA 1650 
Val Gly Gly His Asn Cys Val Cys Lys Pro Gly Tyr Thr Gly Asn Gly 
505 510 515 

AOS ACA TGC AAA GCA TTT TGC AAA GAT GGC TGT AGG AAT GGA GGA GCC 1698 
Thr Thr Cys Lys Ala Phe Cys Lys Asp Gly Cys Arg Asn Gly Gly Ala 
520 525 530 

TGT ATT GCC GCT AAT GTG TGT GCC TGC CCA CAA GGC TTC ACT GGA CCC 1746 
Cys Ile Ala Ala Asn Val Cys Ala Cys Pro Gln Gly Phe Thr Gly Pro 
535 540 545 550 



AGC TGT GAA ACG GAC ATT GAT GAA TGC TCT GAT GGT TTT GTT CAA TGT 1794 
Ser Cys Glu Thr Asp Ile Asp Glu Cys Ser Asp Gly Phe Val Gln Cys 
555 560 565 

GAC AGT CGT GCT AAT TGC ATT AAC CTG CCT GGA TGG TAC CAC TGT GAG 1842 
Asp Ser Arg Ala Asn Cys Ile Asn Leu Pro Gly Trp Tyr His Cys Glu 
570 575 580 

TGC AGA GAT GGC TAC CAT GAC AAT GGG ATG TTT TCA CCA AGT GGA GAA 1890 
Cys Arg Asp Gly Tyr His Asp Asn Gly Met Phe Ser Pro Ser Gly Glu 
585 590 595 

TCG TGT GAA GAT ATT GAT GAG TGT GGG ACC GGG AGG CAC AGC TGT GCC 1938 
Ser Cys Glu Asp Ile Asp Glu Cys Gly Thr Gly Arg His Ser Cys Ala 
600 605 610 

AAT GAT ACC ATT TGC TTC AAT TTG GAT GGC GGA TAT GAT TGT CGA TGT 1986 
Asn Asp Thr Ile Cys Phe Asn Leu Asp Gly Gly Tyr Asp Cys Arg Cys 
615 620 625 630 

CCT CAT GGA A/US AAT TGC ACA GGG GAC TGC ATC CAT GAT GGA AAA GTT 2034 
Pro His Gly Lys Asn Cys Thr Gly Asp Cys Ile His Asp Gly Lys Val 
635 640 645 

AAG CAC AAT GGT CAG ATT TGG GTG TTG GAA AAT GAC AGG TGC TCT GTG 2082 
Lys His Asn Gly Gln Ile Trp Val Leu Glu Asn Asp Arg Cys Ser Val 
650 655 660 

TGC TCA TGT CAG AAT GGA TTC GTT ATG TGT CGA CGG ATG GTC TGT GAC 2130 
Cys Ser Cys Gln Asn Gly Phe Val Met Cys Arg Arg Met Val Cys Asp 
665 670 675 

TGT GAG AAT CCC ACA GTT GAT CTT TTT TGC TGC CCT GAA TGT GAC CCA 2178 
Cys Glu Asn Pro Thr Val Asp Leu Phe Cys Cys Pro Glu Cys Asp Pro 
680 685 690 
AGG CTT AGT AGT CAG TGC CTC CAT CAA AAT GGG GAA ACT TTG TAT AAC 2226 
Arg Leu Ser Ser Gln Cys Leu His Gln Asn Gly Glu Thr Leu Tyr Asn 
695 700 705 710 

AGT GGT GAC ACC TGG GTC CAG AAT TGT CAA CAG TGC CGC TGC TTG 2274 
Ser Gly Asp Thr Trp Val Gln Asn Cys Gln Gln Cys Arg Cys Leu Gln 
715 720 725 

GGG GAA GTT GAT TGT TGG CCC CTG CCT TGC CCA GAT GTG GAG TGT GAA 2322 
Gly Glu Val Asp Cys Trp Pro Leu Pro Cys Pro Asp Val Glu Cys Glu 
730 735 740 

TTC AGC ATT CTC CCA GAG AAT GAG TGC TGC CCG CGC TGT GTC ACA GAC 2370 
Phe Ser Ile Leu Pro Glu Asn Glu Cys Cys Pro Arg Cys Val Thr Asp 
745 750 755 

CCT TGC CAG GCT GAC ACC ATC CGC AAT GAC ATC ACC 7^ ACT TGC CTG 2418 
Pro Cys Gln Ala Asp Thr Ile Arg Asn Asp Ile Thr Lys Thr Cys Leu 
760 765 770 

GAC GAA ATG AAT GTG GTT CGC TTC ACC GGG TCC TCT TGG ATC AAA CAT 2466 
Asp Glu Met Asn Val Val Arg Phe Thr Gly Ser Ser Trp Ile Lys His 
775 780 785 790 

GGC ACT GAG TGT ACT CTC TGC CAG TGC AAG AAT GGC CAC ATC TGT TGC 2514 
Gly Thr Glu Cys Thr Leu Cys Gln Cys Lys Asn Gly His Ile Cys Cys 
795 800 805 

TCA GTG GAT CCA CAG TGC CTT CAG GAA CTG TGAAGTTAAC TGTCTCATGG 2564 
Ser Val Asp Pro Gln Cys Leu Gln Glu Leu 
810 815 

GAGATTTCTG TTAAAAGAAT GTTCTTTCAT TAAATVGAOGA A7U\AGAAGTT AAAACTTAAA 2624 

TTGGGTGATT TGTGGGCAGC TAAATGCAGC TTTGTTAATA GCTGAGTGAA CTTTCAATTA 2684 

TGAAATTTGT GGAGCTTGAC AAAATCACAA AAGGAAT^TT ACTOOOOCAA AATTAGADCT 2744 

CAAGTCTGOC TCTACTGTGT CTCACATCAC CATGTAGAA6 AATGGGOGTA CAGTATATAC 2804 

OGTCACATOC TGAAOOCTGG ATAGAAAGCC TGAGOOCATT GGATCTGT6A T^AGOCTCTAG 2864 

CTTCACTGGT GCAGAAAATT TTCCTCTAGA TCAGAATCTT CAGAATCAGT TAGGTTOCTC 2924 

ACTGCAAGAA ATAAAATGTC AGGCAGTGAA TGAATTATAT TTTCAGAAGT AAAGCAAAGA 2984 

AGCTATAACA TGTTATGTAC AGTACACTCT GAA7UVGAAAT CTGAAACAAG TTATTGTAAT 3044 



GATAAAAATA ATGCACAGGC ATGGTTACTT AATATTTTCT AACAGGAAAA GTCATOOCTA 3104 
TTTCICTTGTT TTACTGCACT TAATATTATT TGGTTGAATT TGTTCAGTAT AAGCTOGTTC 3164 
TTGTGCAAAA TTAAATAAAT ATTTCTCTTA CXTTT 3198 



(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 499 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 



Met Glu Leu Ser Glu Pro Val Val Glu Asn Gly Glu Val Glu Met Ala 
1 5 10 15 

Leu Glu Glu Ser Trp Glu His Ser Lys Glu Val Ser Glu Ala Glu Pro 
20 25 30 

Gly Gly Gly Ser Ser Gly Asp Ser Gly Pro Pro Glu Glu Ser Gly Gln 
35 40 45 

Glu Met Met Glu Glu Lys Glu Glu Ile Arg Lys Ser Lys Ser Val Ile 
50 55 60 

Val Pro Ser Gly Ala Pro Lys Lys Glu His Val Asn Val Val Phe Ile 
65 70 75 80 

Gly His Val Asp Ala Gly Lys Ser Thr Ile Gly Gly Gln Ile Met Phe 
85 90 95 

Leu Thr Gly Met Ala Asp Lys Arg Thr Leu Glu Lys Tyr Glu Arg Glu 
100 105 110 

Ala Glu Glu Lys Asn Arg Glu Thr Trp Tyr Leu Ser Trp Ala Leu Asp 
115 120 125 

Thr Asn Gln Glu Glu Arg Asp Lys Gly Lys Thr Val Glu Val Gly Arg 
130 135 140 

Ala Tyr Phe Glu Thr Glu Arg Lys His Phe Thr Ile Leu Asp Ala Pro 
145 150 155 160 

Gly His Lys Ser Phe Val Pro Asn Met Ile Gly Gly Ala Ser Gln Ala 
165 170 175 
Asp Leu Ala Val Leu Val Ile Ser Ala Arg Lys Gly Glu Phe Glu Thr 
180 185 190 

Gly Phe Glu Lys Gly Gly Gln Thr Arg Glu His Ala Met Phe Gly Lys 
195 200 205 

Thr Ala Gly Val Lys His Leu Ile Val Leu Ile Asn Lys Met Asp Asp 
210 215 220 

Pro Thr Val Asn Trp Gly Ile Glu Arg Tyr Glu Glu Cys Lys Glu Lys 
225 230 235 240 

Leu Val Pro Phe Leu Lys Lys Val Gly Phe Ser Pro Lys Lys Asp Ile 
245 250 255 

His Phe Met Pro Cys Ser Gly Leu Thr Gly Ala Asn Ile Lys Glu Gln 
260 265 270 

Ser Asp Phe Cys Pro Trp Tyr Thr Gly Leu Pro Phe Ile Pro Tyr Leu 
275 280 285 

Asn Asn Leu Pro Asn Phe Asn Arg Ser Ile Asp Gly Pro Ile Arg Leu 
290 295 300 

Pro Ile Val Asp Lys Tyr Lys Asp Met Gly Thr Val Val Leu Gly Lys 
305 310 315 320 

Leu Glu Ser Gly Ser Ile Phe Lys Gly Gln Gln Leu Val Met Met Pro 
325 330 335 

Asn Lys His Asn Val Glu Val Leu Gly Ile Leu Ser Asp Asp Thr Glu 
340 345 350 

Thr Asp Phe Val Ala Pro Gly Glu Asn Leu Lys Ile Arg Leu Lys Gly 
355 360 365 

Ile Glu Glu Glu Glu Ile Leu Pro Glu Phe Ile Leu Cys Asp Pro Ser 
370 375 380 

Asn Leu Cys His Ser Gly Arg Thr Phe Asp Val Gln Ile Val Ile Ile 
385 390 395 400 

Glu His Lys Ser Ile Ile Cys Pro Gly Tyr Asn Ala Val Leu His Ile 
405 410 415 

His Thr Cys Ile Glu Glu Val Glu Ile Thr Ala Leu Ile Ser Leu Val 
420 425 430 

Asp Lys Lys Ser Gly Glu Lys Ser Lys Thr Arg Pro Arg Phe Val Lys 
435 440 445 
Gln Asp Gln Val Cys Ile Ala Arg Leu Arg Thr Ala Gly Thr Ile Cys 
450 455 460 

Leu Glu Thr Phe Lys Asp Phe Pro Gln Met Gly Arg Phe Thr Leu Arg 
465 470 475 480 

Asp Glu Gly Lys Thr Ile Ala Ile Gly Lys Val Leu Lys Leu Val Pro 
485 490 495 

Glu Lys Asp 

(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1497 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 

ATGGAACTTT CAGAACCTGT TGTAGAAAAT GGAGAGGTGG AAATGGCCCT AGAAGAATCA 60 

TGGGAGCACA GTAAAGAAGT AAGTGAAGCC GAGCCTGGGG GTGGTTCCTC GGGAGATTCA 120 

GGGCCCCCAG AAGAAAGTGG CCAGGAAATG ATGGAGGAAA AAGAGGAAAT AAGAAAATCC 180 

AAATCTGTGA TCGTACCCTC AGGTGCACCT AAGAAAGAAC ACGTAAATGT AGTATTCATT 240 

GGCCATGTAG ACGCTGGCAA GTCAACCATC GGAGGACAGA TAATGTTTTT GACTGGAATG 300 

GCTGACAAAA GAACACTGGA GAAATATGAA AGAGAAGCTG AGGAAAAAAA CAGAGAAACC 360 

TGGTATTTGT CCTGGGCCTT AGATACAAAT CAGGAGGAAC GAGACAAGGG TAAAACAGTC 420 

GAAGTGGGTC GTGCCTATTT TGAAACAGAA AGGAAACATT TCACAATTTT AGATGCCCCT 480 

GGCCACAAGA GTTTTGTGCC AAATATGATT GGTGGTGCTT CTCAAGCTGA TTTGGCTGTG 540 

CTGGTCATCT CTGCCAGGAA AGGAGAGTTT GAAACTGGAT TTGAAAAAGG TGGACAGACA 600 

AGAGAACATG CGATGTTTGG CAAAACGGCA GGAGTAAAAC ATTTAATAGT GCTTATTAAT 660 

AAGATGGATG ATCCCACAGT AAATTGGGGC ATCGAGAGAT ATGAAGAATG TAAAGAAAAA 720 

CTGGTGCCCT TTTTGAAAAA AGTAGGCTTT AGTCCAAAAA AGGACATTCA CTTTATGCCC 780 



TGCTCAGGAC TGACCGGAGC AAATATTAAA GAGCAGTCAG ATTTCTGCCC TTGGTACACT 840 

GGATTACCAT TTATTCCGTA TTTGAATAAC TTGCCAAACT TCAACAGATC AATTGATGGA 900 

CCAATAAGAC TGCCAATTGT GGATAAGTAC AAAGATATGG GCACTGTGGT CCTGGGAAAG 960 

CTGGAATCCG GGTCCATTTT TAAAGGCCAG CAGCTCGTGA TGATGCCAAA CAAGCACAAT 1020 

GTAGAAGTTC TTGGAATACT TTCTGATGAT ACTGAAACTG ATTTTGTAGC CCCAGGTGAA 1080 

AACCTCAAAA TCAGACTGAA GGGAATTGAA GAAGAAGAGA TTCTTCCAGA ATTCATACTT 1140 

TGTGATCCTA GTAACCTCTG CCATTCTGGA CGCACGTTTG ATGTTCAGAT AGTGATTATT 1200 

GAGCACAAAT CCATCATCTG CCCAGGTTAT AATGCGGTGC TGCACATTCA TACTTGTATT 1260 

GAGGAAGTTG AGATAACAGC GTTAATCTCC TTGGTAGACA AAAAATCAGG GGAAAAAAGT 1320 

AAGACACGAC CCCGCTTCGT GAAACAAGAT CAAGTATGCA TTGCTCGTTT AAGGACAGCA 1380 

GGAACCATCT GCCTCGAGAC GTTCAAAGAT TTTCCTCAGA TGGGTCGTTT TACTTTAAGA 1440 

GATGAGGGTA AGACTATTGC AATTGGAAAA GTTCTGAAAT TGGTCCCAGA GAAGGAC 1497 
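
The coding sequence of SEQ ID NO: 41 can be checked against the amino acid sequence of SEQ ID NO: 40 by translating it with the standard genetic code. The following is a minimal sketch of such a check and is not part of the listing; the helper name `translate` and the table layout are assumptions made for illustration only.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter residue names as used in the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate(dna):
    """Translate a DNA string (blanks allowed) codon by codon until a stop codon."""
    seq = "".join(dna.split()).upper()
    residues = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":  # stop codon ends the reading frame
            break
        residues.append(THREE_LETTER[aa])
    return residues


# First 15 bases of SEQ ID NO: 41, as printed in blocks of ten.
print(" ".join(translate("ATGGAACTTT CAGAA")))  # Met Glu Leu Ser Glu
```

Applied to the first fifteen bases, the sketch reproduces the first five residues of SEQ ID NO: 40; the same call on the full 1497 bases would be expected to give all 499 residues, provided the bases have been transcribed correctly.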

(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2057 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: Human fetal brain cDNA library 

(B) CLONE: GEN-077A09 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 144..1640 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 
TOOOGGCOGG CKXX3GCAGC AAOGATGAAG CCTGCPJXXSG (DGOGGGATAC OCTCAAGGTA 60 

AAAOGATGGG ACX30GGGGCA (XTGTGGAAC CTTOOOGAGA OGAAOOGITA GTGT0GCTT6 120 

AAOGTTOCAA TTCAGCX3CTT AOC ATG GAA CTT TCA GAA CCT GTT GTA GAA 170 
Met Glu Leu Ser Glu Pro Val Val Glu 
1 5 

AAT GGA GAG GTG GAA ATG GCC CTA GAA GAA TCA TGG GAG CAC AGT AAA 218 
Asn Gly Glu Val Glu Met Ala Leu Glu Glu Ser Trp Glu His Ser Lys 
10 15 20 25 

GAA GTA AGT GAA GCC GAG CCT GGG GGT GGT TCC TCG GGA GAT TCA GGG 266 
Glu Val Ser Glu Ala Glu Pro Gly Gly Gly Ser Ser Gly Asp Ser Gly 

30 35 40 

CCC CCA GAA GAA AGT GGC CAG GAA ATG ATG GAG GAA AAA GAG GAA ATA 314 
Pro Pro Glu Glu Ser Gly Gln Glu Met Met Glu Glu Lys Glu Glu Ile 
45 50 55 

AGA AAA TCC AAA TCT GTG ATC GTA CCC TCA GGT GCA CCT AAG AAA GAA 362 
Arg Lys Ser Lys Ser Val Ile Val Pro Ser Gly Ala Pro Lys Lys Glu 
60 65 70 

CAC GTA AAT GTA GTA TTC ATT GGC CAT GTA GAC GCT GGC AAG TCA ACC 410 
His Val Asn Val Val Phe Ile Gly His Val Asp Ala Gly Lys Ser Thr 
75 80 85 

ATC GGA GGA CAG ATA ATG TTT TTG ACT GGA ATG GCT GAC AAA AGA ACA 458 
Ile Gly Gly Gln Ile Met Phe Leu Thr Gly Met Ala Asp Lys Arg Thr 
90 95 100 105 

CTG GAG AAA TAT GAA AGA GAA GCT GAG GAA AAA AAC AGA GAA ACC TGG 506 
Leu Glu Lys Tyr Glu Arg Glu Ala Glu Glu Lys Asn Arg Glu Thr Trp 
110 115 120 

TAT TTG TCC TGG GCC TTA GAT ACA AAT CAG GAG GAA CGA GAC AAG GGT 554 
Tyr Leu Ser Trp Ala Leu Asp Thr Asn Gln Glu Glu Arg Asp Lys Gly 
125 130 135 

AAA ACA GTC GAA GTG GGT CGT GCC TAT TTT GAA ACA GAA AGG AAA CAT 602 
Lys Thr Val Glu Val Gly Arg Ala Tyr Phe Glu Thr Glu Arg Lys His 
140 145 150 

TTC ACA ATT TTA GAT GCC CCT GGC CAC AAG AGT TTT GTC CCA AAT ATG 650 
Phe Thr Ile Leu Asp Ala Pro Gly His Lys Ser Phe Val Pro Asn Met 
155 160 165 



ATT GGT GGT GCT TCT CAA GCT GAT TTG GCT GTG CTG GTC ATC TCT GCC 698 
Ile Gly Gly Ala Ser Gln Ala Asp Leu Ala Val Leu Val Ile Ser Ala 
170 175 180 185 

AGG AAA GGA GAG TTT GAA ACT GGA TTT GAA AAA GGT GGA CAG ACA AGA 746 
Arg Lys Gly Glu Phe Glu Thr Gly Phe Glu Lys Gly Gly Gln Thr Arg 
190 195 200 

GAA CAT GCG ATG TTT GGC AAA ACG GCA GGA GTA AAA CAT TTA ATA GTG 794 
Glu His Ala Met Phe Gly Lys Thr Ala Gly Val Lys His Leu Ile Val 
205 210 215 

CTT ATT AAT AAG ATG GAT GAT CCC ACA GTA AAT TGG GGC ATC GAG AGA 842 
Leu Ile Asn Lys Met Asp Asp Pro Thr Val Asn Trp Gly Ile Glu Arg 
220 225 230 

TAT GAA GAA TGT AAA GAA AAA CTG GTG CCC TTT TTG AAA AAA GTA GGC 890 
Tyr Glu Glu Cys Lys Glu Lys Leu Val Pro Phe Leu Lys Lys Val Gly 
235 240 245 

TTT AGT CCA AAA AAG GAC ATT CAC TTT ATG CCC TGC TCA GGA CTG ACC 938 
Phe Ser Pro Lys Lys Asp Ile His Phe Met Pro Cys Ser Gly Leu Thr 
250 255 260 265 

GGA GCA AAT ATT AAA GAG CAG TCA GAT TTC TGC CCT TGG TAC ACT GGA 986 
Gly Ala Asn Ile Lys Glu Gln Ser Asp Phe Cys Pro Trp Tyr Thr Gly 
270 275 280 

TTA CCA TTT ATT CCG TAT TTG AAT AAC TTG CCA AAC TTC AAC AGA TCA 1034 
Leu Pro Phe Ile Pro Tyr Leu Asn Asn Leu Pro Asn Phe Asn Arg Ser 
285 290 295 

ATT GAT GGA CCA ATA AGA CTG CCA ATT GTG GAT AAG TAC AAA GAT ATG 1082 
Ile Asp Gly Pro Ile Arg Leu Pro Ile Val Asp Lys Tyr Lys Asp Met 
300 305 310 

GGC ACT GTG GTC CTG GGA AAG CTG GAA TCC GGG TCC ATT TTT AAA GGC 1130 
Gly Thr Val Val Leu Gly Lys Leu Glu Ser Gly Ser Ile Phe Lys Gly 
315 320 325 

CAG CAG CTC GTG ATG ATG CCA AAC AAG CAC AAT GTA GAA GTT CTT GGA 1178 
Gln Gln Leu Val Met Met Pro Asn Lys His Asn Val Glu Val Leu Gly 
330 335 340 345 

ATA CTT TCT GAT GAT ACT GAA ACT GAT TTT GTA GCC CCA GGT GAA AAC 1226 
Ile Leu Ser Asp Asp Thr Glu Thr Asp Phe Val Ala Pro Gly Glu Asn 
350 355 360 

CTC AAA ATC AGA CTG AAG GGA ATT GAA GAA GAA GAG ATT CTT CCA GAA 1274 
Leu Lys Ile Arg Leu Lys Gly Ile Glu Glu Glu Glu Ile Leu Pro Glu 
365 370 375 



TTC ATA CTT TGT GAT CCT AGT AAC CTC TGC CAT TCT GGA CGC ACG TTT 1322 
Phe Ile Leu Cys Asp Pro Ser Asn Leu Cys His Ser Gly Arg Thr Phe 
380 385 390 

GAT GTT CAG ATA GTG ATT ATT GAG CAC AAA TCC ATC ATC TGC CCA GGT 1370 
Asp Val Gln Ile Val Ile Ile Glu His Lys Ser Ile Ile Cys Pro Gly 
395 400 405 

TAT AAT GCG GTG CTG CAC ATT CAT ACT TGT ATT GAG GAA GTT GAG ATA 1418 
Tyr Asn Ala Val Leu His Ile His Thr Cys Ile Glu Glu Val Glu Ile 
410 415 420 425 

ACA GCG TTA ATC TCC TTG GTA GAC AAA AAA TCA GGG GAA AAA AGT AAG 1466 
Thr Ala Leu Ile Ser Leu Val Asp Lys Lys Ser Gly Glu Lys Ser Lys 
430 435 440 

ACA CGA CCC CGC TTC GTG AAA CAA GAT CAA GTA TGC ATT GCT CGT TTA 1514 
Thr Arg Pro Arg Phe Val Lys Gln Asp Gln Val Cys Ile Ala Arg Leu 
445 450 455 

AGG ACA GCA GGA ACC ATC TGC CTC GAG ACG TTC AAA GAT TTT CCT CAG 1562 
Arg Thr Ala Gly Thr Ile Cys Leu Glu Thr Phe Lys Asp Phe Pro Gln 
460 465 470 

ATG GGT CGT TTT ACT TTA AGA GAT GAG GGT AAG ACC ATT GCA ATT GGA 1610 
Met Gly Arg Phe Thr Leu Arg Asp Glu Gly Lys Thr Ile Ala Ile Gly 
475 480 485 

AAA GTT CTG AAA TTG GTC CCA GAG AAG GAC TAAGCAATTT TCTTGATGOC 1660 
Lys Val Leu Lys Leu Val Pro Glu Lys Asp 
490 495 

TCTGCAAGAT ACTGTGAGGA GAATTGACA6 CAAAAGTTCA CCAOCTACTC TTATTTACTG 1720 

CXXIATTGATT GACTTTTCIT CATATTTTGC AAAGAGAAAT TTCACAGCAA. AAATTCATGT 1780 

TTTGTCAOCT TTCTCATGTT GAGATCTC3TT ATGTCACTGA TGAATTTAOC CTCAAGTTTC 1840 

CTTOCTCTGT AOCACTCTGC TTOCTTGGAC AATATCAGTA ATAGCTTTGT AAGTGATGTC 1900 

GAOSTAATTG CCTACAGTAA TAAAAAAATA ATGTACTTTA ATTTTTCATT TTCTTTTAGG 1960 

ATATTT/VGAC CAaXTTGTT CX3V0GCAAAC CAGAGTGTGT CAGTGTTTGT GTGTGTGTTA 2020 

AAATGATAAC TAACATCTGA ATAAAATACT OCATTTG 2057 
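
The lengths stated for SEQ ID NOS: 40, 41 and 42 can be checked against one another: the CDS feature 144..1640 should span the 1497 bases of SEQ ID NO: 41, which in turn should encode the 499 amino acids of SEQ ID NO: 40. The following sketch shows that arithmetic; the function name `cds_length` is an assumption made for illustration and is not part of the listing.

```python
def cds_length(location):
    """Return the number of bases covered by a 'start..end' feature location."""
    start, end = (int(part) for part in location.split(".."))
    return end - start + 1  # both end positions are inclusive


CDS_LOCATION = "144..1640"   # feature (ix)(B) of SEQ ID NO: 42
CODING_BASES = 1497          # stated length of SEQ ID NO: 41
PROTEIN_RESIDUES = 499       # stated length of SEQ ID NO: 40

span = cds_length(CDS_LOCATION)
print(span, span == CODING_BASES, span == 3 * PROTEIN_RESIDUES)
```

The span comes to exactly three bases per residue, so the feature location excludes the stop codon; in the sequence as printed, the TAA stop occupies positions 1641 to 1643.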



